Multilingual Autoregressive Entity Linking
Nicola De Cao, Ledell Wu, Kashyap Popat, Mikel Artetxe, Naman Goyal, Mikhail Plekhanov, Luke Zettlemoyer, Nicola Cancedda, Sebastian Riedel, Fabio Petroni

TL;DR
This paper introduces mGENRE, a multilingual autoregressive sequence-to-sequence model for entity linking that improves accuracy and efficiency by leveraging cross-lingual entity name matching and zero-shot capabilities.
Contribution
The paper presents a novel autoregressive approach for multilingual entity linking that effectively models cross-lingual interactions and enables fast, scalable search without large vector indices.
Findings
Improves accuracy on zero-shot languages (those unseen in training) by over 50%.
Establishes new state-of-the-art results on MEL benchmarks.
Demonstrates efficient cross-lingual entity linking without large-scale vector search.
Abstract
We present mGENRE, a sequence-to-sequence system for the Multilingual Entity Linking (MEL) problem -- the task of resolving language-specific mentions to a multilingual Knowledge Base (KB). For a mention in a given language, mGENRE predicts the name of the target entity left-to-right, token-by-token in an autoregressive fashion. The autoregressive formulation allows us to effectively cross-encode mention string and entity names to capture more interactions than the standard dot product between mention and entity vectors. It also enables fast search within a large KB even for mentions that do not appear in mention tables and with no need for large-scale vector indices. While prior MEL works use a single representation for each entity, we match against entity names of as many languages as possible, which allows exploiting language connections between source input and target name.…
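The search described in the abstract, generating an entity name token by token while staying inside the KB and without a vector index, can be illustrated with a prefix tree over entity names. The following is a minimal sketch under stated assumptions, not the authors' implementation: the tiny KB, the `<eos>` marker, the word-level tokens, and the `toy_score` function (a stand-in for the log-probabilities a trained sequence-to-sequence model would supply) are all hypothetical.

```python
from typing import Callable, Dict, List, Set, Tuple

END = "<eos>"  # hypothetical end-of-name marker


def build_trie(names: List[List[str]]) -> Dict:
    """Build a nested-dict prefix tree over tokenized entity names."""
    root: Dict = {}
    for tokens in names:
        node = root
        for tok in tokens + [END]:
            node = node.setdefault(tok, {})
    return root


def allowed_next(trie: Dict, prefix: List[str]) -> List[str]:
    """Return the tokens that keep `prefix` a valid entity-name prefix."""
    node = trie
    for tok in prefix:
        node = node.get(tok)
        if node is None:
            return []
    return list(node.keys())


def constrained_beam_search(
    trie: Dict,
    score_fn: Callable[[List[str], str], float],
    beam_size: int = 2,
    max_len: int = 10,
) -> List[Tuple[List[str], float]]:
    """Beam search that only expands tokens permitted by the trie."""
    beams: List[Tuple[List[str], float]] = [([], 0.0)]
    finished: List[Tuple[List[str], float]] = []
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            for tok in allowed_next(trie, prefix):
                new_score = score + score_fn(prefix, tok)
                if tok == END:
                    finished.append((prefix, new_score))
                else:
                    candidates.append((prefix + [tok], new_score))
        if not candidates:
            break
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]
    finished.sort(key=lambda c: c[1], reverse=True)
    return finished


def toy_score(mention_tokens: Set[str]) -> Callable[[List[str], str], float]:
    """Stand-in scorer (assumption): penalize tokens absent from the mention."""
    def score(prefix: List[str], tok: str) -> float:
        return 0.0 if tok == END or tok in mention_tokens else -1.0
    return score


# Hypothetical three-entity KB, tokenized at the word level for readability.
kb_names = [["Albert", "Einstein"], ["Albert", "Camus"], ["Einstein", "Tower"]]
trie = build_trie(kb_names)
results = constrained_beam_search(trie, toy_score({"Albert", "Einstein"}))
best_name, best_score = results[0]
```

Because every expansion is drawn from the trie, each hypothesis returned is a valid entity name, and the cost of a step depends on the beam size and the branching at the current prefix rather than on the total number of entities. In a real system the scorer would condition on the full mention context, and the trie would hold subword token IDs rather than words.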
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Data Quality and Management
