Multi-Vector Index Compression in Any Modality

Hanxiang Qin; Alexander Martin; Rohan Jha; Chunsheng Zuo; Reno Kriz; Benjamin Van Durme

arXiv:2602.21202·cs.IR·February 25, 2026

TL;DR

This paper introduces query-agnostic index compression techniques for multi-vector (late-interaction) retrieval across text, visual-document, and video modalities. Each document is compressed to a constant vector budget, reducing storage costs while maintaining or improving retrieval performance.

Contribution

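The paper proposes four index compression methods: sequence resizing, memory tokens, hierarchical pooling, and a new attention-guided clustering (AGC) approach. The methods apply to text, images, and videos, and are evaluated on multiple datasets, including BEIR (text) and ViDoRe (visual documents).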

Findings

01 Attention-guided clustering outperforms other compression methods

02 Compressed indexes achieve comparable or better retrieval performance

03 Methods are effective across text, visual, and video modalities

Abstract

We study efficient multi-vector retrieval for late interaction in any modality. Late interaction has emerged as a dominant paradigm for information retrieval in text, images, visual documents, and videos, but its computation and storage costs grow linearly with document length, making it costly for image-, video-, and audio-rich corpora. To address this limitation, we explore query-agnostic methods for compressing multi-vector document representations under a constant vector budget. We introduce four approaches for index compression: sequence resizing, memory tokens, hierarchical pooling, and a novel attention-guided clustering (AGC). AGC uses an attention-guided mechanism to identify the most semantically salient regions of a document as cluster centroids and to weight token aggregation. Evaluating these methods on retrieval tasks spanning text (BEIR), visual-document (ViDoRe), and…
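The two ideas the abstract leans on, late-interaction scoring and compression to a fixed vector budget, can be illustrated with a small sketch. This is not the authors' implementation. It assumes the standard MaxSim form of late interaction, assumes a per-token saliency score is already available (for example, derived from attention weights), and uses a deliberately simple clustering rule: the most salient tokens become centroids, every other token joins its most similar centroid, and each cluster is pooled with saliency weights. The function names (`maxsim`, `compress_to_budget`) and the toy vectors are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm > 0 else list(v)


def maxsim(query_vecs, doc_vecs):
    """Late-interaction score: each query vector takes its best-matching
    document vector, and the per-query maxima are summed."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


def compress_to_budget(doc_vecs, saliency, budget, eps=1e-9):
    """Query-agnostic compression to at most `budget` vectors.

    ASSUMPTION: `saliency` holds one non-negative score per token and is
    supplied by the caller; how the paper derives it is not shown here.
    """
    if len(doc_vecs) <= budget:
        return [normalize(v) for v in doc_vecs]

    # 1. The `budget` most salient tokens serve as cluster centroids.
    order = sorted(range(len(doc_vecs)), key=lambda i: saliency[i], reverse=True)
    centroid_ids = order[:budget]
    clusters = {c: [] for c in centroid_ids}

    # 2. Each token joins its most similar centroid (centroids join themselves).
    for i, vec in enumerate(doc_vecs):
        if i in clusters:
            clusters[i].append(i)
        else:
            best = max(centroid_ids, key=lambda c: dot(vec, doc_vecs[c]))
            clusters[best].append(i)

    # 3. Saliency-weighted mean of each cluster, renormalized to unit length.
    compressed = []
    for c in centroid_ids:
        members = clusters[c]
        total = sum(saliency[i] + eps for i in members)
        dim = len(doc_vecs[c])
        pooled = [
            sum((saliency[i] + eps) * doc_vecs[i][k] for i in members) / total
            for k in range(dim)
        ]
        compressed.append(normalize(pooled))
    return compressed


# Toy data (hypothetical): four 2-D token vectors, compressed to a budget of 2.
doc = [normalize(v) for v in ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9])]
token_saliency = [0.4, 0.1, 0.3, 0.2]
query = [normalize([1.0, 0.0])]

compressed_doc = compress_to_budget(doc, token_saliency, budget=2)
full_score = maxsim(query, doc)
compressed_score = maxsim(query, compressed_doc)
print(len(doc), "->", len(compressed_doc), "vectors")
print("full:", full_score, "compressed:", compressed_score)
```

Because the budget is fixed, the stored index grows with the number of documents rather than with their token counts, which is the storage argument the abstract makes. The clustering rule above is only one way to respect such a budget.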

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Video Analysis and Summarization