SPFresh: Incremental In-Place Update for Billion-Scale Vector Search
Yuming Xu, Hengyu Liang, Jin Li, Shuotao Xu, Qi Chen, Qianxi Zhang, Cheng Li, Ziyue Yang, Fan Yang, Yuqing Yang, Peng Cheng, Mao Yang

TL;DR
SPFresh introduces an in-place update system for billion-scale vector search that significantly reduces latency and resource usage compared to traditional global rebuild methods, enabling efficient, accurate, and scalable vector index updates.
Contribution
The paper presents SPFresh, a system built on the LIRE (Lightweight Incremental RE-balancing) protocol for low-overhead, in-place vector index updates on billion-scale data, improving efficiency and accuracy over periodic global rebuilds.
Findings
SPFresh achieves lower search latency than global rebuild methods.
At peak, it needs only 1% of the DRAM and less than 10% of the cores required by the state of the art.
These results hold for a billion-scale vector index with a 1% daily vector update rate.
Abstract
Approximate Nearest Neighbor Search (ANNS) is now widely used in various applications, ranging from information retrieval, question answering, and recommendation, to search for similar high-dimensional vectors. As the amount of vector data grows continuously, it becomes important to support updates to the vector index, the enabling technique that allows for efficient and accurate ANNS on vectors. Because of the curse of high dimensionality, it is often costly to identify the right neighbors of a single new vector, a necessary process for index update. To amortize update costs, existing systems maintain a secondary index to accumulate updates, which are merged into the main index by globally rebuilding the entire index periodically. However, this approach has high fluctuations of search latency and accuracy, not to mention that it requires substantial resources and is extremely…
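The baseline pattern the abstract criticizes (a secondary index that accumulates updates and is periodically merged by a global rebuild) can be illustrated with a toy sketch. This is not SPFresh or LIRE, and it is not any real system's API: the class name `RebuildIndex`, the `rebuild_threshold` parameter, and the use of brute-force search in place of a real ANNS index are all assumptions made for illustration.

```python
import math


class RebuildIndex:
    """Toy baseline (hypothetical): main index + secondary buffer + global rebuild."""

    def __init__(self, rebuild_threshold=3):
        self.main = []        # vectors already merged into the "main index"
        self.secondary = []   # new vectors accumulated since the last rebuild
        self.rebuild_threshold = rebuild_threshold  # assumed merge trigger
        self.rebuilds = 0     # counts how many global rebuilds have run

    def insert(self, vec):
        # Updates land in the secondary index first to amortize update cost.
        self.secondary.append(tuple(vec))
        if len(self.secondary) >= self.rebuild_threshold:
            self._global_rebuild()

    def _global_rebuild(self):
        # A real system would recompute the whole index structure here;
        # the cost grows with the total number of vectors, not just new ones.
        self.main = self.main + self.secondary
        self.secondary = []
        self.rebuilds += 1

    def search(self, query, k=1):
        # Queries must consult both indexes until the next merge.
        candidates = self.main + self.secondary
        return sorted(candidates, key=lambda v: math.dist(v, query))[:k]


idx = RebuildIndex(rebuild_threshold=3)
for v in [(0, 0), (1, 1), (5, 5)]:
    idx.insert(v)          # third insert triggers one global rebuild
idx.insert((0.9, 0.9))     # stays in the secondary index for now
nearest = idx.search((1, 1.05), k=1)
```

In this sketch every rebuild touches all stored vectors, which is the periodic, resource-heavy step that an in-place update scheme aims to avoid.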
