Evaluating Entity Disambiguation and the Role of Popularity in Retrieval-Based NLP
Anthony Chen, Pallavi Gudipati, Shayne Longpre, Xiao Ling, Sameer Singh

TL;DR
This paper introduces AmbER sets, a benchmark for evaluating entity disambiguation in retrieval-based NLP, revealing that current systems favor popular entities and struggle with rarer ones.
Contribution
The paper proposes AmbER sets as a new benchmark for assessing entity disambiguation, highlighting the prevalence of popularity bias in retrieval systems.
Findings
Retrievers perform poorly on less popular entities sharing the same name.
Because of popularity bias, retrievers are twice as likely to make retrieval errors on rarer entities that share a name.
AmbER sets effectively evaluate entity disambiguation capabilities.
Abstract
Retrieval is a core component for open-domain NLP tasks. In open-domain tasks, multiple entities can share a name, making disambiguation an inherent yet under-explored problem. We propose an evaluation benchmark for assessing the entity disambiguation capabilities of these retrievers, which we call Ambiguous Entity Retrieval (AmbER) sets. We define an AmbER set as a collection of entities that share a name along with queries about those entities. By covering the set of entities for polysemous names, AmbER sets act as a challenging test of entity disambiguation. We create AmbER sets for three popular open-domain tasks: fact checking, slot filling, and question answering, and evaluate a diverse set of retrievers. We find that the retrievers exhibit popularity bias, significantly under-performing on rarer entities that share a name, e.g., they are twice as likely to retrieve erroneous…
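The definition above (a set of entities that share a name, plus queries about each entity) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper's code: the class and function names are hypothetical, popularity is modeled as a plain integer, the most popular entity in a set is treated as the "head" and all others as the "tail", and a retriever is modeled as a function that maps a query to a single entity identifier.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Entity:
    entity_id: str   # hypothetical identifier for the entity's document
    popularity: int  # assumed popularity signal (for example, page views)
    queries: list[str] = field(default_factory=list)


@dataclass
class AmberSet:
    name: str               # the ambiguous name shared by all entities
    entities: list[Entity]  # every entity that goes by that name

    def head(self) -> Entity:
        # Assumption: the single most popular entity is the "head".
        return max(self.entities, key=lambda e: e.popularity)


def accuracy_by_popularity(
    amber_sets: list[AmberSet],
    retrieve: Callable[[str], str],
) -> dict[str, float]:
    """Top-1 retrieval accuracy, split into head and tail entities."""
    hits = {"head": 0, "tail": 0}
    totals = {"head": 0, "tail": 0}
    for amber_set in amber_sets:
        head = amber_set.head()
        for entity in amber_set.entities:
            bucket = "head" if entity is head else "tail"
            for query in entity.queries:
                totals[bucket] += 1
                # A hit means the retriever returned the entity the query is about.
                hits[bucket] += int(retrieve(query) == entity.entity_id)
    return {b: (hits[b] / totals[b] if totals[b] else 0.0) for b in hits}


# Toy data, invented purely for this example.
toy_sets = [
    AmberSet(
        name="Mercury",
        entities=[
            Entity("popular", 1000, ["query about popular 1", "query about popular 2"]),
            Entity("rare", 10, ["query about rare 1", "query about rare 2"]),
        ],
    )
]


def always_popular(query: str) -> str:
    # A deliberately biased toy retriever: it ignores the query entirely.
    return "popular"


scores = accuracy_by_popularity(toy_sets, always_popular)
```

With this toy retriever, `scores` is `{"head": 1.0, "tail": 0.0}`, an extreme version of the popularity bias the abstract describes. Reporting head and tail accuracy separately is the design choice that matters: a single aggregate score would hide the gap.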
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Graph Neural Networks
