Early Stage Sparse Retrieval with Entity Linking
Dahlia Shehata, Negar Arabzadeh, Charles L. A. Clarke

TL;DR
This paper enhances sparse retrieval methods by integrating entity linking to expand queries and documents, narrowing the performance gap with dense models while maintaining efficiency, especially in early-stage retrieval tasks.
Contribution
It introduces a novel entity linking-based expansion technique for sparse retrievers, improving recall and complementarity in retrieval results without heavy computational costs.
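As a rough illustration of why entity-based expansion can help an exact-match retriever, the sketch below appends linked entity labels to a query and counts shared terms before and after. It is a toy under stated assumptions, not the authors' pipeline: the mention table `TOY_LINKER` and the helpers `expand_with_entities` and `term_overlap` are hypothetical, and a real system would use a trained entity linker and a scoring function such as BM25 rather than raw term overlap.

```python
from typing import Dict, List

# Hypothetical mention -> entity-label table standing in for a real entity linker.
TOY_LINKER: Dict[str, List[str]] = {
    "big apple": ["new york city"],
    "jfk": ["john f. kennedy"],
}


def expand_with_entities(text: str, linker: Dict[str, List[str]] = TOY_LINKER) -> str:
    """Append the labels of any linked entities to the original text."""
    lowered = text.lower()
    extra: List[str] = []
    for mention, labels in linker.items():
        # Naive substring match; an assumption made only to keep the sketch short.
        if mention in lowered:
            extra.extend(labels)
    return text if not extra else text + " " + " ".join(extra)


def term_overlap(query: str, doc: str) -> int:
    """Count distinct exact-match terms shared by a query and a document."""
    return len(set(query.lower().split()) & set(doc.lower().split()))


query = "hotels in the big apple"
doc = "cheap hotels in new york city"
expanded_query = expand_with_entities(query)

print(term_overlap(query, doc))           # overlap without expansion
print(term_overlap(expanded_query, doc))  # overlap after entity expansion
```

In this toy case the expanded query shares more terms with the document because the linked entity label bridges the vocabulary gap between "big apple" and "new york city".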
Findings
Entity linking expansion improves recall@1000.
Expanded and non-expanded runs retrieve complementary results.
Run fusion maximizes retrieval effectiveness.
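The findings above can be made concrete with a small sketch of recall@k and run fusion. The summary does not say which fusion method the paper uses, so reciprocal rank fusion (RRF) is swapped in here purely as an assumption; the function names, the document IDs, and the constant `k = 60` are illustrative rather than taken from the paper.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def reciprocal_rank_fusion(runs: Iterable[List[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists by summing 1 / (k + rank) for each document."""
    scores: Dict[str, float] = defaultdict(float)
    for run in runs:
        for rank, doc_id in enumerate(run, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def recall_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of relevant documents that appear in the top k results."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


# Made-up runs: the non-expanded and expanded runs find different relevant documents.
base_run = ["d1", "d2", "d3"]
expanded_run = ["d4", "d2", "d5"]
relevant = {"d1", "d4", "d5"}

fused_run = reciprocal_rank_fusion([base_run, expanded_run])

print(recall_at_k(base_run, relevant, 3))
print(recall_at_k(expanded_run, relevant, 3))
print(recall_at_k(fused_run, relevant, 5))
```

Because the two runs retrieve complementary relevant documents in this toy data, the fused list covers all of them once the cutoff is deep enough, which is the intuition behind evaluating at a deep cutoff such as recall@1000.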
Abstract
Despite the advantages of their low-resource settings, traditional sparse retrievers depend on exact matching approaches between high-dimensional bag-of-words (BoW) representations of both the queries and the collection. As a result, retrieval performance is restricted by semantic discrepancies and vocabulary gaps. On the other hand, transformer-based dense retrievers introduce significant improvements in information retrieval tasks by exploiting low-dimensional contextualized representations of the corpus. While dense retrievers are known for their relative effectiveness, they suffer from lower efficiency and lack of generalization issues, when compared to sparse retrievers. For a lightweight retrieval task, high computational resources and time consumption are major barriers encouraging the renunciation of dense models despite potential gains. In this work, we propose boosting the…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Topic Modeling · Information Retrieval and Search Behavior
