Scalable Distributed Vector Search via Accuracy Preserving Index Construction
Yuming Xu, Qianxi Zhang, Qi Chen, Baotong Lu, Menghao Li, Philip Adams, Mingqin Li, Zengzhong Li, Jing Liu, Cheng Li, Fan Yang

TL;DR
This paper introduces SPIRE, a scalable distributed vector index for billion-scale datasets that keeps accuracy stable and search cost predictable while achieving up to 9.64X higher throughput than state-of-the-art systems.
Contribution
SPIRE combines a balanced partition granularity with an accuracy-preserving recursive multi-level construction, enabling scalable, accurate, high-throughput approximate nearest neighbor search over billions of vectors.
Findings
Achieves up to 9.64X higher throughput than state-of-the-art systems.
Handles up to 8 billion vectors across 46 nodes.
Maintains stable accuracy with predictable search costs.
Abstract
Scaling Approximate Nearest Neighbor Search (ANNS) to billions of vectors requires distributed indexes that balance accuracy, latency, and throughput. Yet existing index designs struggle with this tradeoff. This paper presents SPIRE, a scalable vector index based on two design decisions. First, it identifies a balanced partition granularity that avoids read-cost explosion. Second, it introduces an accuracy-preserving recursive construction that builds a multi-level index with predictable search cost and stable accuracy. In experiments with up to 8 billion vectors across 46 nodes, SPIRE achieves high scalability and up to 9.64X higher throughput than state-of-the-art systems.
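To make the idea of a recursively built multi-level index concrete, here is a minimal single-machine sketch in Python. It is not SPIRE's algorithm: plain k-means stands in for the paper's partitioning, and the names `partition`, `build_index`, `search`, `fanout`, `top_size`, and `nprobe` are hypothetical, chosen only for illustration. The sketch clusters the vectors into partitions, then clusters the resulting centroids, and repeats until the top level is small enough to scan directly.

```python
import math

def dist2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def partition(points, k, iters=5):
    """Group points into at most k partitions with plain k-means.

    Returns a list of (centroid, member_indices). K-means is an illustrative
    stand-in here, not SPIRE's partitioning algorithm.
    """
    step = len(points) / k
    centroids = [points[int(i * step)] for i in range(k)]  # evenly spaced seeds
    dim = len(points[0])
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for idx, p in enumerate(points):
            nearest = min(range(k), key=lambda c: dist2(p, centroids[c]))
            groups[nearest].append(idx)
        for c, members in enumerate(groups):
            if members:  # move each centroid to the mean of its members
                centroids[c] = tuple(
                    sum(points[i][d] for i in members) / len(members)
                    for d in range(dim)
                )
    # Drop partitions that ended up empty.
    return [(centroids[c], groups[c]) for c in range(k) if groups[c]]

def build_index(vectors, fanout, top_size):
    """Recursively partition vectors, then centroids, until the top is small.

    levels[0] groups vector ids; levels[i] groups partition ids of levels[i-1].
    Assumes fanout >= 2 and top_size >= 1 so every pass shrinks the input.
    """
    levels, points = [], list(vectors)
    while len(points) > top_size:
        k = math.ceil(len(points) / fanout)  # target about `fanout` items per partition
        level = partition(points, k)
        levels.append(level)
        points = [centroid for centroid, _ in level]
    return levels

def search(levels, vectors, query, nprobe, k=1):
    """Descend from the top level, keeping the nprobe closest partitions per level."""
    if not levels:
        candidates = range(len(vectors))  # tiny dataset: brute force
    else:
        candidates = range(len(levels[-1]))  # scan every top-level partition
        for level in reversed(levels):
            best = sorted(candidates, key=lambda i: dist2(query, level[i][0]))[:nprobe]
            candidates = [child for i in best for child in level[i][1]]
    # Candidates are now vector ids; rank them by true distance.
    return sorted(candidates, key=lambda i: dist2(query, vectors[i]))[:k]
```

In this sketch, `fanout` plays the role of partition granularity and `nprobe` caps how many partitions are read at each level, so the number of partitions visited per query is bounded by the top-level size plus `nprobe` times the remaining levels. Setting `nprobe` at least as large as the partition count at every level makes the search exhaustive and therefore exact; smaller values trade accuracy for lower read cost.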
Taxonomy
Topics: Data Management and Algorithms · Graph Theory and Algorithms · Advanced Database Systems and Queries
