Col-Bandit: Zero-Shot Query-Time Pruning for Late-Interaction Retrieval
Roi Pony, Adi Raz, Oshri Naparstek, Idan Friedman, Udi Barzelay

TL;DR
Col-Bandit is a zero-shot query-time pruning method that cuts the compute cost of late-interaction retrieval (up to 5× fewer MaxSim FLOPs) by adaptively identifying and computing only the token interactions needed for accurate ranking.
Contribution
It introduces a novel, uncertainty-aware pruning algorithm that sparsifies late-interaction matrices on the fly without requiring offline preprocessing or retraining.
Findings
Reduces MaxSim FLOPs by up to 5× on benchmarks
Preserves ranking quality while pruning interactions
Operates as a zero-shot, drop-in layer for existing systems
Abstract
Multi-vector late-interaction retrievers such as ColBERT achieve state-of-the-art retrieval quality, but their query-time cost is dominated by exhaustively computing token-level MaxSim interactions for every candidate document. While approximating late interaction with single-vector representations reduces cost, it often incurs substantial accuracy loss. We introduce Col-Bandit, a query-time pruning algorithm that reduces this computational burden by casting reranking as a finite-population Top-K identification problem. Col-Bandit maintains uncertainty-aware bounds over partially observed document scores and adaptively reveals only the (document, query token) MaxSim entries needed to determine the top results under statistical decision bounds with a tunable relaxation. Unlike coarse-grained approaches that prune entire documents or tokens offline, Col-Bandit sparsifies the interaction…
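To make the idea of "revealing only the needed (document, query token) entries" concrete, here is a minimal Python sketch. It is not the authors' algorithm: it swaps the paper's statistical decision bounds for hard deterministic bounds, assuming every MaxSim entry lies in [-1, 1] (true for unit-normalized embeddings). The function names (maxsim_entry, full_maxsim, adaptive_top_k) and the rule for choosing which entry to reveal next are hypothetical, chosen only for illustration.

```python
import numpy as np

def maxsim_entry(q_tok, doc_emb):
    """One (document, query token) MaxSim entry: best match over document tokens."""
    return float(np.max(doc_emb @ q_tok))

def full_maxsim(query_emb, doc_emb):
    """Exhaustive late-interaction score: sum over query tokens of the max similarity."""
    return float(np.sum(np.max(query_emb @ doc_emb.T, axis=1)))

def adaptive_top_k(query_emb, docs, k, lo=-1.0, hi=1.0):
    """Reveal MaxSim entries one at a time until the top-k set is certain.

    Assumption (not from the paper): each entry lies in [lo, hi], so a
    partially observed score is bracketed by filling missing entries with
    lo (lower bound) or hi (upper bound).
    Returns (sorted top-k document indices, number of entries revealed).
    """
    n, t = len(docs), query_emb.shape[0]
    observed = np.zeros((n, t), dtype=bool)
    partial = np.zeros(n)  # sum of the entries revealed so far, per document
    revealed = 0
    while True:
        missing = t - observed.sum(axis=1)
        lower = partial + missing * lo
        upper = partial + missing * hi
        order = np.argsort(-lower, kind="stable")
        top, rest = order[:k], order[k:]
        # Stop once every current top-k document provably beats every other one.
        if rest.size == 0 or lower[top].min() >= upper[rest].max():
            return sorted(top.tolist()), revealed
        weakest = top[np.argmin(lower[top])]      # least certain member of the top-k
        challenger = rest[np.argmax(upper[rest])]  # strongest outside contender
        # Hypothetical selection rule: refine whichever of the two is less observed.
        d = weakest if missing[weakest] >= missing[challenger] else challenger
        i = int(np.argmin(observed[d]))            # first unobserved query token
        partial[d] += maxsim_entry(query_emb[i], docs[d])
        observed[d, i] = True
        revealed += 1
```

Because the bounds here are worst-case, this variant always returns the exact top-k set (barring ties) but typically prunes far less than the up-to-5× savings reported above; the abstract attributes those savings to statistical bounds with a tunable relaxation, which this sketch does not implement.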
Taxonomy
Topics: Information Retrieval and Search Behavior · Topic Modeling · Natural Language Processing Techniques
