Efficient Multi-Vector Dense Retrieval Using Bit Vectors
Franco Maria Nardini, Cosimo Rulli, Rossano Venturini

TL;DR
This paper introduces EMVB, a novel framework that enhances multi-vector dense retrieval by combining bit vector pre-filtering, SIMD-optimized centroid interactions, product quantization, and term filtering, achieving faster retrieval with less memory.
Contribution
EMVB integrates multiple efficiency techniques into multi-vector dense retrieval, significantly improving speed and memory usage without sacrificing accuracy.
Findings
EMVB is up to 2.8x faster than PLAID.
EMVB reduces the memory footprint by 1.8x compared to PLAID.
EMVB maintains retrieval accuracy comparable to PLAID.
Abstract
Dense retrieval techniques employ pre-trained large language models to build high-dimensional representations of queries and passages. These representations are used to compute the relevance of a passage with respect to a query using efficient similarity measures. In this line, multi-vector representations encode queries and documents at a per-token level, which improves effectiveness at the expense of a one-order-of-magnitude increase in memory footprint and query latency. Recently, PLAID has tackled these problems by introducing a centroid-based term representation to reduce the memory impact of multi-vector systems. By exploiting a centroid interaction mechanism, PLAID filters out non-relevant documents, thus reducing the cost of the successive ranking stages. This paper proposes "Efficient Multi-Vector dense retrieval with Bit vectors" (EMVB), a novel framework for efficient query processing in…
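As a rough illustration of the centroid interaction idea mentioned in the abstract, the sketch below scores documents twice: once with full per-token vectors and once with each document token replaced by the id of its nearest centroid. It assumes the sum-of-maximum-similarity scoring commonly used by per-token retrieval systems, which the abstract does not spell out. All function names and the toy data are hypothetical and are not taken from the PLAID or EMVB code.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs: List[Vector], doc_vecs: List[Vector]) -> float:
    """Exact score (assumed scoring rule): for each query token, take the
    best-matching document token, then sum over query tokens."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


def nearest_centroid(vec: Vector, centroids: List[Vector]) -> int:
    """Index of the centroid with the highest inner product with vec."""
    return max(range(len(centroids)), key=lambda i: dot(vec, centroids[i]))


def centroid_interaction_score(
    query_vecs: List[Vector], doc_centroid_ids: List[int], centroids: List[Vector]
) -> float:
    """Approximate score: each document token is represented only by its
    centroid id, so the document's own vectors are never touched."""
    # Query-to-centroid scores depend only on the query and could be
    # computed once and reused for every document.
    query_centroid = [[dot(q, c) for c in centroids] for q in query_vecs]
    return sum(max(row[cid] for cid in doc_centroid_ids) for row in query_centroid)


# Toy example (hypothetical data): two centroids, two documents.
centroids = [(1.0, 0.0), (0.0, 1.0)]
query = [(1.0, 0.0), (0.0, 1.0)]
doc_a = [(0.9, 0.1), (0.1, 0.9)]
doc_b = [(0.8, 0.2), (0.9, 0.1)]

ids_a = [nearest_centroid(v, centroids) for v in doc_a]
ids_b = [nearest_centroid(v, centroids) for v in doc_b]

approx_a = centroid_interaction_score(query, ids_a, centroids)
approx_b = centroid_interaction_score(query, ids_b, centroids)
exact_a = maxsim_score(query, doc_a)
exact_b = maxsim_score(query, doc_b)

# Keep only documents whose approximate score reaches a (hypothetical)
# threshold; only these would go on to the more expensive exact scoring.
threshold = 1.5
survivors = [name for name, s in [("doc_a", approx_a), ("doc_b", approx_b)] if s >= threshold]
```

In this toy setting the approximate scores rank the two documents in the same order as the exact scores, so the cheaper centroid-based pass can discard `doc_b` before any per-token vectors are read. Real systems choose the threshold and the number of centroids empirically.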
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques · Algorithms and Data Compression
