BabelNet is both a multilingual encyclopedic dictionary,
with wide lexicographic and encyclopedic coverage of terms, and a semantic network/ontology
that connects concepts and named entities in a very large network of semantic relations,
made up of about 23 million entries.
Conceived within the Sapienza NLP Group, engineered and maintained by Babelscape,
BabelNet follows the WordNet model based on the notion of synset (for synonym set),
but extends it to contain multilingual lexicalizations.
Each BabelNet synset represents a given meaning and contains all the synonyms which
express that meaning in a range of different languages.
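The synset idea described above can be illustrated with a small sketch. This is not the BabelNet API: the class `MultilingualSynset`, its fields, the identifier `example:0001`, and the sample lemmas are all hypothetical, chosen only to show one meaning grouping its synonyms by language.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MultilingualSynset:
    """One meaning plus the words that express it, grouped by language.

    Hypothetical illustration only; not a BabelNet data type.
    """
    synset_id: str  # made-up identifier, not a real BabelNet ID
    gloss: str      # short definition of the meaning
    lexicalizations: Dict[str, List[str]] = field(default_factory=dict)

    def add(self, language: str, lemma: str) -> None:
        """Record a lemma for a language, ignoring exact duplicates."""
        lemmas = self.lexicalizations.setdefault(language, [])
        if lemma not in lemmas:
            lemmas.append(lemma)

    def synonyms(self, language: str) -> List[str]:
        """Return the lemmas expressing this meaning in one language."""
        return list(self.lexicalizations.get(language, []))

    def languages(self) -> List[str]:
        """Return the language codes covered by this synset, sorted."""
        return sorted(self.lexicalizations)


# Sample data, invented for the example.
car = MultilingualSynset("example:0001", "a motor vehicle with four wheels")
for lang, lemma in [
    ("EN", "car"),
    ("EN", "automobile"),
    ("IT", "automobile"),
    ("IT", "macchina"),
    ("FR", "voiture"),
]:
    car.add(lang, lemma)

print(car.synonyms("EN"))
print(car.languages())
```

A WordNet-style synset would correspond to the case where only one language key is present; the multilingual extension amounts to keeping several such synonym lists under the same meaning.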
BabelNet 5.3 covers 600 languages and is obtained from the automatic integration of:
WordNet, the most popular computational lexicon of English (version 3.0).
Open English WordNet, a fork of the Princeton WordNet developed under an open source methodology (October 2023 release).
Wikipedia, the largest collaborative multilingual Web encyclopedia (November 2023 dump).
OmegaWiki, a large collaborative multilingual dictionary (January 2017 dump).
Wiktionary, a collaborative project to produce a free-content multilingual dictionary (November 2023 dump).
Wikidata, a free knowledge base that can be read and edited by humans and machines alike (November 2023 dump).
GeoNames, a free geographical database covering all countries and containing over eight million placenames (October 2020 dump).
ImageNet, an image database organized according to the WordNet hierarchy (2011 release).
BabelPic, a large collection of pictures illustrating non-concrete concepts.
VerbAtlas, the largest language-independent verb predicate and role resource.
HeTOP Q-Codes, a large multilingual health-related lexicon.
Translations obtained from sense-annotated sentences.
BabelNet is linked to several resources and applications from the Sapienza NLP Group:
VerbAtlas: a large multilingual verb predicate and role repository.
InVeRo: intelligible verbs and roles produced by a state-of-the-art neural Semantic Role Labeling system.
Train-O-Matic: the first large-scale silver data creation approach to multilingual Word Sense Disambiguation.
MuLaN: silver data creation for Word Sense Disambiguation by means of multilingual label propagation.
OneSeC, SensEmBERT and ARES: latent Transformer-based sense representations which achieve state-of-the-art performance in multilingual Word Sense Disambiguation.
Conception: human-intelligible multilingual representations of BabelNet synsets.
SyntagNet: a large collection of disambiguated free word associations and collocations.
SyntagRank: a SyntagNet- and BabelNet-based multilingual Word Sense Disambiguation system.
Babelfy: a multilingual disambiguation and entity linking system.
Wikipedia Bitaxonomy: a state-of-the-art taxonomy of Wikipedia pages aligned to a taxonomy of Wikipedia categories.
BabelNet has received funding from the European Research Council (ERC)
under the European Union's FP7 specific programme 'Ideas' under grant agreement
no. 259234 (MultiJEDI ERC StG).