Multivector Reranking in the Era of Strong First-Stage Retrievers
Silvio Martinico, Franco Maria Nardini, Cosimo Rulli, Rossano Venturini

TL;DR
This paper makes multivector retrieval more efficient by replacing the token-level gather phase with a learned sparse retriever, integrating inference-free methods, and optimizing the reranking stage, achieving significant speedups without quality loss.
Contribution
The paper introduces a two-stage retrieval pipeline that uses a learned sparse retriever (LSR) to gather candidates, combined with inference-free methods and candidate pruning, improving efficiency while maintaining effectiveness.
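The two-stage idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: sparse vectors are plain dictionaries standing in for LSR output, the first stage is a brute-force scan rather than an inverted index, and the rerank step uses the standard late-interaction (MaxSim) score. All names (`sparse_dot`, `maxsim`, `two_stage_search`) and the toy data are hypothetical.

```python
from typing import Dict, List, Tuple

SparseVec = Dict[int, float]   # term id -> weight (stand-in for LSR output)
MultiVec = List[List[float]]   # one dense vector per token


def sparse_dot(q: SparseVec, d: SparseVec) -> float:
    """Inner product of two sparse vectors, iterating over the shorter one."""
    if len(q) > len(d):
        q, d = d, q
    return sum(w * d[t] for t, w in q.items() if t in d)


def maxsim(q: MultiVec, d: MultiVec) -> float:
    """Late interaction: each query token takes its best document token; sum the maxima."""
    return sum(max(sum(a * b for a, b in zip(qv, dv)) for dv in d) for qv in q)


def two_stage_search(
    q_sparse: SparseVec,
    q_multi: MultiVec,
    docs_sparse: List[SparseVec],
    docs_multi: List[MultiVec],
    n_candidates: int,
    k: int,
) -> List[Tuple[int, float]]:
    # Stage 1: gather candidates with the sparse retriever.
    # (Brute force here; an assumption made to keep the sketch short.)
    by_sparse = sorted(
        range(len(docs_sparse)),
        key=lambda i: sparse_dot(q_sparse, docs_sparse[i]),
        reverse=True,
    )
    candidates = by_sparse[:n_candidates]
    # Stage 2: rerank only the candidates with the full multivector score.
    reranked = sorted(
        ((maxsim(q_multi, docs_multi[i]), i) for i in candidates), reverse=True
    )
    return [(i, score) for score, i in reranked[:k]]


# Toy collection of three documents (hypothetical values).
docs_sparse = [{1: 1.0, 2: 0.5}, {2: 1.0}, {3: 2.0}]
docs_multi = [[[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5]], [[-1.0, 0.0]]]
results = two_stage_search(
    {1: 1.0, 2: 1.0}, [[1.0, 0.0], [0.0, 1.0]], docs_sparse, docs_multi,
    n_candidates=2, k=2,
)
print(results)
```

Only the `n_candidates` documents that survive the first stage are scored with MaxSim, so the cost of the expensive multivector comparison depends on the candidate set size rather than on the collection size.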
Findings
Achieves a speedup of over 24x compared to state-of-the-art systems.
Improves efficiency by up to 1.8x while maintaining retrieval quality.
Replaces token-level gather with a learned sparse retriever for better candidate selection.
Abstract
Learned multivector representations power modern search systems with strong retrieval effectiveness, but their real-world use is limited by the high cost of exhaustive token-level retrieval. Therefore, most systems adopt a gather-and-refine strategy, where a lightweight gather phase selects candidates for full scoring. However, this approach requires expensive searches over large token-level indexes and often misses the documents that would rank highest under full similarity. In this paper, we reproduce several state-of-the-art multivector retrieval methods on two publicly available datasets, providing a clear picture of the current multivector retrieval field and observing the inefficiency of token-level gathering. Building on top of that, we show that replacing the token-level gather phase with a single-vector document retriever -- specifically, a learned sparse retriever (LSR)…
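A toy example can make concrete how a token-level gather phase may miss the document that ranks highest under full similarity. The sketch below is an illustration under assumptions, not a reproduction of any system evaluated in the paper: the gather step is deliberately simplified (each query token retrieves only the document owning its single nearest token), full similarity is taken to be the standard MaxSim score, and all function names and vectors are hypothetical.

```python
from typing import List, Set

MultiVec = List[List[float]]  # one dense vector per token


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def maxsim(q: MultiVec, d: MultiVec) -> float:
    """Full similarity: sum over query tokens of the best-matching document token."""
    return sum(max(dot(qv, dv) for dv in d) for qv in q)


def token_level_gather(q: MultiVec, docs: List[MultiVec], per_token: int = 1) -> Set[int]:
    """Simplified gather: each query token pulls in the docs owning its nearest tokens."""
    candidates: Set[int] = set()
    for qv in q:
        scored = sorted(
            ((dot(qv, dv), doc_id) for doc_id, d in enumerate(docs) for dv in d),
            reverse=True,
        )
        candidates.update(doc_id for _, doc_id in scored[:per_token])
    return candidates


def exhaustive_top_k(q: MultiVec, docs: List[MultiVec], k: int) -> List[int]:
    """Reference ranking: score every document with MaxSim."""
    return sorted(range(len(docs)), key=lambda i: maxsim(q, docs[i]), reverse=True)[:k]


query = [[1.0, 0.0], [0.0, 1.0]]
docs = [
    [[0.75, 0.0], [0.0, 0.75]],  # doc 0: good match for both query tokens
    [[1.0, 0.0]],                # doc 1: perfect match for the first token only
    [[0.0, 1.0]],                # doc 2: perfect match for the second token only
]

gathered = token_level_gather(query, docs)
best = exhaustive_top_k(query, docs, k=1)
recall = len(gathered & set(best)) / len(best)
print(gathered, best, recall)
```

Here doc 0 has the highest full score (1.5 versus 1.0 for the other two), yet it is never the nearest neighbor of any individual query token, so this simplified gather step returns only docs 1 and 2 and the recall of the true top-1 is zero.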
Taxonomy
Topics: Information Retrieval and Search Behavior · Advanced Image and Video Retrieval Techniques · Topic Modeling
