On Single and Multiple Representations in Dense Passage Retrieval
Craig Macdonald, Nicola Tonellotto, Iadh Ounis

TL;DR
This paper compares single-representation and multiple-representation dense passage retrieval methods. It finds that multiple representations are generally more effective than single ones, especially for complex or difficult queries, but are less efficient.
Contribution
The paper provides a direct comparison of single-representation and multiple-representation dense retrieval methods, highlighting their relative strengths and weaknesses across different query types.
Findings
Multiple representations outperform single representations in MAP and MRR@10.
Multiple representations are more effective for complex, definitional, and difficult queries.
Single-representation methods such as ANCE are more efficient in response time and memory usage.
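The findings above are reported in terms of MAP and MRR@10. As a reminder of what those metrics measure, here is a minimal sketch using their standard textbook definitions; the function names and the toy passage IDs are hypothetical and are not taken from the paper.

```python
from typing import Iterable, List, Set


def reciprocal_rank_at_k(ranked_ids: List[str], relevant_ids: Set[str], k: int = 10) -> float:
    """1 / rank of the first relevant passage within the top k, or 0.0 if none."""
    for rank, pid in enumerate(ranked_ids[:k], start=1):
        if pid in relevant_ids:
            return 1.0 / rank
    return 0.0


def average_precision(ranked_ids: List[str], relevant_ids: Set[str]) -> float:
    """Mean of precision values taken at the rank of each relevant passage."""
    if not relevant_ids:
        return 0.0
    hits = 0
    total = 0.0
    for rank, pid in enumerate(ranked_ids, start=1):
        if pid in relevant_ids:
            hits += 1
            total += hits / rank
    return total / len(relevant_ids)


def mean(values: Iterable[float]) -> float:
    """MAP and MRR@10 are the per-query values averaged over all queries."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0


# Toy ranking for one query (hypothetical IDs).
ranked = ["p3", "p1", "p7", "p2"]
relevant = {"p1", "p2"}
rr = reciprocal_rank_at_k(ranked, relevant)
ap = average_precision(ranked, relevant)
```

MRR@10 rewards placing the first relevant passage near the top, while MAP also accounts for how well the remaining relevant passages are ranked.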
Abstract
The advent of contextualised language models has brought gains in search effectiveness, not just when applied for re-ranking the output of classical weighting models such as BM25, but also when used directly for passage indexing and retrieval, a technique which is called dense retrieval. In the existing literature in neural ranking, two dense retrieval families have become apparent: single representation, where entire passages are represented by a single embedding (usually BERT's [CLS] token, as exemplified by the recent ANCE approach), or multiple representations, where each token in a passage is represented by its own embedding (as exemplified by the recent ColBERT approach). These two families have not been directly compared. However, because of the likely importance of dense retrieval moving forward, a clear understanding of their advantages and disadvantages is paramount. To this…
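The difference between the two families can be illustrated with a small scoring sketch. It assumes the commonly described scoring functions: a dot product between one query embedding and one passage embedding for the single-representation case, and a ColBERT-style sum of per-query-token maximum similarities for the multiple-representation case. The function names and toy vectors are hypothetical, and this is not the authors' implementation.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def single_representation_score(query_vec: Vector, passage_vec: Vector) -> float:
    """One embedding per query and per passage (assumed dot-product similarity)."""
    return dot(query_vec, passage_vec)


def multiple_representation_score(query_tokens: List[Vector], passage_tokens: List[Vector]) -> float:
    """One embedding per token: for each query token, take its best-matching
    passage token, then sum those maxima (assumed late-interaction scoring)."""
    return sum(max(dot(q, p) for p in passage_tokens) for q in query_tokens)


# Toy 2-dimensional embeddings (hypothetical values).
query_tokens = [[1.0, 0.0], [0.0, 1.0]]
passage_tokens = [[0.5, 0.0], [0.0, 2.0], [1.0, 1.0]]

single = single_representation_score([1.0, 1.0], [0.5, 2.0])
multiple = multiple_representation_score(query_tokens, passage_tokens)
```

The sketch also hints at the efficiency trade-off: the single-representation index stores one vector per passage, whereas the multiple-representation index stores one vector per passage token and compares each query token against them.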
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Web Data Mining and Analysis
