Reducing the Footprint of Multi-Vector Retrieval with Minimal Performance Impact via Token Pooling

Benjamin Clavié; Antoine Chaffin; Griffin Adams

arXiv:2409.14683 · cs.IR · September 24, 2024

Open Access

TL;DR

This paper presents a clustering-based token pooling method that significantly reduces the storage requirements of multi-vector retrieval systems like ColBERT, with minimal impact on retrieval performance, enabling more practical deployment.

Contribution

A simple, architecture-agnostic token pooling technique that halves the index footprint of multi-vector retrieval with virtually no loss in retrieval performance.

Findings

01

Reduces index size by 50% with negligible performance loss

02

Further vector-count reductions of 66% to 75% with less than 5% degradation

03

No changes needed to existing models or query processing
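The percentages in the findings line up with keeping roughly one vector for every two, three, or four tokens. The short sketch below works through that arithmetic. The name `pool_factor` and the 300-token document length are assumptions made for illustration; they do not appear on this page.

```python
import math

def vectors_after_pooling(num_tokens: int, pool_factor: int) -> int:
    """Vectors kept when about `pool_factor` tokens share one stored vector.

    `pool_factor` is a hypothetical parameter name used for this sketch.
    """
    return max(1, math.ceil(num_tokens / pool_factor))

def reduction(num_tokens: int, pool_factor: int) -> float:
    """Fraction of stored vectors removed by pooling."""
    return 1 - vectors_after_pooling(num_tokens, pool_factor) / num_tokens

# Assumed example: a 300-token document pooled at factors 2, 3, and 4.
for factor in (2, 3, 4):
    kept = vectors_after_pooling(300, factor)
    print(f"factor {factor}: {kept} vectors kept, {reduction(300, factor):.1%} removed")
```

Under these assumptions, a factor of 2 corresponds to the 50% reduction, and factors of 3 and 4 correspond to the 66% to 75% range.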

Abstract

Over the last few years, multi-vector retrieval methods, spearheaded by ColBERT, have become an increasingly popular approach to Neural IR. By storing representations at the token level rather than at the document level, these methods have demonstrated very strong retrieval performance, especially in out-of-domain settings. However, the storage and memory requirements necessary to store the large number of associated vectors remain an important drawback, hindering practical adoption. In this paper, we introduce a simple clustering-based token pooling approach to aggressively reduce the number of vectors that need to be stored. This method can reduce the space and memory footprint of ColBERT indexes by 50% with virtually no retrieval performance degradation. This method also allows for further reductions, reducing the vector count by 66% to 75%, with degradation remaining below 5% on a…
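To make the idea of clustering-based token pooling concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation. The abstract does not name a clustering algorithm, so this sketch assumes a greedy agglomerative merge on cosine similarity followed by mean pooling. The names `pool_tokens`, `pool_factor`, and `maxsim` are hypothetical. Scoring uses the standard ColBERT-style MaxSim operator, which runs on pooled documents without modification.

```python
import numpy as np

def pool_tokens(token_vecs: np.ndarray, pool_factor: int) -> np.ndarray:
    """Merge similar token vectors until about n / pool_factor remain.

    Assumption: greedy agglomerative merging of the two most similar
    cluster centroids (cosine similarity), then mean pooling per cluster.
    """
    vecs = np.asarray(token_vecs, dtype=float)
    n = vecs.shape[0]
    target = max(1, int(np.ceil(n / pool_factor)))
    clusters = [[i] for i in range(n)]  # each token starts as its own cluster
    while len(clusters) > target:
        # Unit-normalised centroid of every current cluster.
        cents = np.stack([vecs[c].mean(axis=0) for c in clusters])
        cents /= np.linalg.norm(cents, axis=1, keepdims=True) + 1e-12
        sim = cents @ cents.T
        np.fill_diagonal(sim, -np.inf)  # never merge a cluster with itself
        a, b = np.unravel_index(np.argmax(sim), sim.shape)
        a, b = int(min(a, b)), int(max(a, b))
        clusters[a].extend(clusters.pop(b))
    # One mean-pooled, re-normalised vector per cluster.
    pooled = np.stack([vecs[c].mean(axis=0) for c in clusters])
    return pooled / (np.linalg.norm(pooled, axis=1, keepdims=True) + 1e-12)

def maxsim(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    """Late-interaction score: best document match per query token, summed."""
    return float((query_vecs @ doc_vecs.T).max(axis=1).sum())

# Assumed toy data: 12 random "token" vectors of dimension 8.
rng = np.random.default_rng(0)
doc = rng.normal(size=(12, 8))
doc /= np.linalg.norm(doc, axis=1, keepdims=True)
query = doc[:3]

pooled = pool_tokens(doc, pool_factor=2)
print(doc.shape, "->", pooled.shape)
print("score (full):", maxsim(query, doc), "score (pooled):", maxsim(query, pooled))
```

The greedy merge is quadratic per step and is only meant for illustration. A production index would more plausibly use a library clustering routine, but the pooling and scoring steps would keep the same shape.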


Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Machine Learning and Algorithms · Machine Learning and Data Classification