Information Retrieval with Entity Linking
Dahlia Shehata

TL;DR
This paper enhances sparse information retrieval by integrating entity linking to expand queries and documents, aiming to narrow the effectiveness gap with dense models while maintaining efficiency.
Contribution
It introduces a novel method of augmenting sparse retrievers with linked entities in explicit and hashed formats using zero-shot entity linking, improving retrieval performance.
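As a rough illustration of the idea, the sketch below appends linked entities to a query and a document before bag-of-words matching. It is not the paper's implementation: the entity linker is a hypothetical toy lookup table (`TOY_LINKER`) standing in for a zero-shot model, and the concrete "explicit" form (lowercased entity title) and "hashed" form (an MD5-derived token) are assumptions, since the summary does not specify either format.

```python
import hashlib
import re

# Hypothetical stand-in for a zero-shot entity linker: maps surface mentions
# to canonical entity titles. A real linker would be a trained model.
TOY_LINKER = {
    "nyc": "New York City",
    "big apple": "New York City",
}


def link_entities(text):
    """Return canonical entity titles for mentions found in text (toy lookup)."""
    lowered = text.lower()
    found = []
    for mention, title in TOY_LINKER.items():
        pattern = r"\b" + re.escape(mention) + r"\b"
        if re.search(pattern, lowered) and title not in found:
            found.append(title)
    return found


def explicit_form(title):
    """Assumed explicit format: the entity title as ordinary lowercase words."""
    return title.lower()


def hashed_form(title):
    """Assumed hashed format: one opaque token derived from the entity title."""
    digest = hashlib.md5(title.encode("utf-8")).hexdigest()[:8]
    return "ent" + digest


def expand(text, mode="explicit"):
    """Append the linked entities to the text in the chosen format."""
    titles = link_entities(text)
    if mode == "explicit":
        extra = [explicit_form(t) for t in titles]
    elif mode == "hashed":
        extra = [hashed_form(t) for t in titles]
    else:
        raise ValueError("mode must be 'explicit' or 'hashed'")
    return " ".join([text] + extra)


def bow_overlap(query, doc):
    """Count distinct terms shared by query and document (exact match)."""
    return len(set(query.lower().split()) & set(doc.lower().split()))


query = "hotels in nyc"
doc = "Cheap places to stay in the Big Apple"

print(bow_overlap(query, doc))
print(bow_overlap(expand(query, "explicit"), expand(doc, "explicit")))
print(bow_overlap(expand(query, "hashed"), expand(doc, "hashed")))
```

In this toy case the query and document share only the term "in" until both are expanded with the same linked entity. The explicit form adds several ordinary words that can also match unrelated text, while the hashed form adds a single token that matches only the same entity.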
Findings
Improved recall@1000 on the MS MARCO dataset
Enhanced retrieval for difficult query subsets
Narrowed effectiveness gap between sparse and dense retrievers
Abstract
Despite the advantages of their low-resource settings, traditional sparse retrievers depend on exact matching between high-dimensional bag-of-words (BoW) representations of both the queries and the collection. As a result, retrieval performance is restricted by semantic discrepancies and vocabulary gaps. On the other hand, transformer-based dense retrievers introduce significant improvements in information retrieval tasks by exploiting low-dimensional contextualized representations of the corpus. While dense retrievers are known for their relative effectiveness, they suffer from lower efficiency and weaker generalization than sparse retrievers. For a lightweight retrieval task, high computational cost and time consumption are major barriers that discourage the use of dense models despite their potential gains. In this work, I propose boosting the…
Taxonomy
Topics: Text and Document Classification Technologies · Topic Modeling · Domain Adaptation and Few-Shot Learning
Methods: Sparse Evolutionary Training · Attention Model
