Probabilistic Bag-Of-Hyperlinks Model for Entity Linking
Octavian-Eugen Ganea, Marina Ganea, Aurelien Lucchi, Carsten Eickhoff, Thomas Hofmann

TL;DR
This paper introduces a probabilistic graphical model for entity linking that jointly disambiguates mentions in a document using minimal parameters, achieving high accuracy efficiently.
Contribution
It presents a simple, fast, and effective probabilistic model for collective entity disambiguation that requires minimal feature engineering and training.
Findings
Outperforms many existing methods on benchmark datasets
Operates efficiently in real-time scenarios
Uses few parameters with simple sufficient statistics
Abstract
Many fundamental problems in natural language processing rely on determining what entities appear in a given text. Commonly referenced as entity linking, this step is a fundamental component of many NLP tasks such as text understanding, automatic summarization, semantic search or machine translation. Name ambiguity, word polysemy, context dependencies and a heavy-tailed distribution of entities contribute to the complexity of this problem. We here propose a probabilistic approach that makes use of an effective graphical model to perform collective entity disambiguation. Input mentions (i.e., linkable token spans) are disambiguated jointly across an entire document by combining a document-level prior of entity co-occurrences with local information captured from mentions and their surrounding context. The model is based on simple sufficient statistics extracted from data, thus relying…
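The joint disambiguation idea in the abstract can be illustrated with a toy sketch. This is not the authors' model or inference procedure: the tables `LOCAL` and `COOC`, their numbers, and the functions `joint_score` and `disambiguate` are all hypothetical, and exhaustive search stands in for whatever inference the paper actually uses. It only shows how a document-level co-occurrence term can override a purely local choice.

```python
import itertools
import math

# Hypothetical toy statistics, invented for illustration (not from the paper).
# LOCAL[mention][entity]: local score of an entity given the mention alone.
LOCAL = {
    "Jordan": {"Michael_Jordan": 0.6, "Jordan_(country)": 0.4},
    "Amman": {"Amman": 1.0},
}

# COOC[{e1, e2}]: assumed score for two entities appearing in one document.
COOC = {
    frozenset(["Michael_Jordan", "Amman"]): 0.01,
    frozenset(["Jordan_(country)", "Amman"]): 0.50,
}

def joint_score(assignment):
    """Log-score of a joint assignment: local terms plus pairwise terms."""
    score = sum(math.log(LOCAL[m][e]) for m, e in assignment.items())
    for e1, e2 in itertools.combinations(assignment.values(), 2):
        # Unseen pairs get a small assumed smoothing value.
        score += math.log(COOC.get(frozenset([e1, e2]), 1e-6))
    return score

def disambiguate(mentions):
    """Exhaustive search over candidate combinations (toy inputs only)."""
    candidates = [list(LOCAL[m]) for m in mentions]
    best = max(
        itertools.product(*candidates),
        key=lambda combo: joint_score(dict(zip(mentions, combo))),
    )
    return dict(zip(mentions, best))

result = disambiguate(["Jordan", "Amman"])
print(result)
```

In this toy setup the local score alone would pick `Michael_Jordan` for "Jordan", while the co-occurrence with `Amman` shifts the joint choice to `Jordan_(country)`. Exhaustive search grows exponentially with the number of mentions, so it is only usable for illustration.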
Taxonomy
Topics: Topic Modeling · Data Quality and Management · Natural Language Processing Techniques
