GEM: A Native Graph-based Index for Multi-Vector Retrieval
Yao Tian, Zhoujin Tian, Xi Zhao, Ruiyuan Zhang, Xiaofang Zhou

TL;DR
GEM introduces a native graph-based indexing method for multi-vector retrieval that preserves semantic richness and significantly improves search speed over existing approaches.
Contribution
The paper presents GEM, a novel multi-vector indexing framework that constructs a proximity graph directly over vector sets, enabling efficient and accurate retrieval.
Findings
Up to 16x speedup over state-of-the-art methods
Maintains or improves retrieval accuracy
Effective in in-domain, out-of-domain, and multi-modal benchmarks
Abstract
In multi-vector retrieval, both queries and data are represented as sets of high-dimensional vectors, enabling finer-grained semantic matching and improving retrieval quality over single-vector approaches. However, its practical adoption is held back by the lack of effective indexing algorithms. Existing work, attempting to reuse standard single-vector indexes, often fails to preserve multi-vector semantics or remains slow. In this work, we present GEM, a native indexing framework for multi-vector representations. The core idea is to construct a proximity graph directly over vector sets, preserving their fine-grained semantics while enabling efficient navigation. First, GEM designs a set-level clustering scheme. It associates each vector set with only its most informative clusters, effectively reducing redundancy without hurting semantic coverage. Then, it builds local proximity graphs…
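To make the setting concrete, here is a minimal sketch of set-to-set scoring in multi-vector retrieval, with an exhaustive search baseline of the kind an index aims to avoid. It is not GEM's algorithm. The abstract does not name the similarity function, so the sum-of-maximum-similarities ("MaxSim") scoring used in late-interaction models is an assumption here, and the names maxsim_score and brute_force_search are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]
VectorSet = Sequence[Vector]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query: VectorSet, doc: VectorSet) -> float:
    """Assumed set-level similarity: for each query vector, take its best
    match among the document's vectors, then sum over query vectors."""
    return sum(max(dot(q, d) for d in doc) for q in query)


def brute_force_search(query: VectorSet, corpus: Sequence[VectorSet], k: int) -> List[int]:
    """Score every vector set in the corpus and return the indices of the
    top-k sets. Ties are broken by the lower index."""
    scored = [(maxsim_score(query, doc), i) for i, doc in enumerate(corpus)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored[:k]]


# Toy data: a two-vector query against three small vector sets.
query = [[1.0, 0.0], [0.0, 1.0]]
corpus = [
    [[1.0, 0.0], [0.0, 1.0]],  # matches both query vectors
    [[1.0, 0.0]],              # matches only the first query vector
    [[0.5, 0.5]],              # partially matches both
]
top = brute_force_search(query, corpus, k=2)
```

The exhaustive scan costs one set-to-set comparison per stored item, which is the cost a native index over vector sets is meant to reduce by navigating to promising candidates only.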
Taxonomy
Topics: Data Management and Algorithms · Information Retrieval and Search Behavior · Advanced Image and Video Retrieval Techniques
