CleANN: Efficient Full Dynamism in Graph-based Approximate Nearest Neighbor Search
Ziyu Zhang, Yuanhao Wei, Joshua Engels, Julian Shun

TL;DR
CleANN is a graph-based approximate nearest neighbor search (ANNS) index that supports full dynamism (insertions, deletions, and searches) concurrently, keeps query quality comparable to static indexes, and improves throughput by 7-1200x on large datasets.
Contribution
The paper introduces CleANN, the first concurrent ANNS index to support full dynamism efficiently while preserving query quality, addressing limitations of earlier static and dynamic indexes.
Findings
Achieves 7-1200x throughput improvement on large datasets.
Maintains query quality comparable to static indexes.
Operates efficiently with concurrent updates and searches.
Abstract
Approximate nearest neighbor search (ANNS) has become a quintessential algorithmic problem for various other foundational data tasks for AI workloads. Graph-based ANNS indexes have superb empirical trade-offs in indexing cost, query efficiency, and query approximation quality. Most existing graph-based indexes are designed for the static scenario, where there are no updates to the data after the index is constructed. However, full dynamism (insertions, deletions, and searches) is crucial to providing up-to-date responses in applications using vector databases. It is desirable that the index efficiently supports updates and search queries concurrently. Existing dynamic graph-based indexes suffer from at least one of the following problems: (1) the query quality degrades as updates happen; and (2) the graph structure updates used to maintain the index quality upon updates are global and…
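To make the three operations of full dynamism concrete, here is a minimal single-threaded sketch of a proximity-graph index with insertion, deletion, and beam search. It is not CleANN's algorithm: deletion is handled with simple tombstones (a common baseline, swapped in for illustration), there is no concurrency, and the class name `ToyDynamicGraphIndex` and the parameters `max_degree` and `beam_width` are hypothetical.

```python
import heapq
import math


class ToyDynamicGraphIndex:
    """Toy proximity graph: insert, tombstone delete, beam search (illustrative only)."""

    def __init__(self, max_degree=4, beam_width=8):
        self.max_degree = max_degree  # hypothetical cap on out-degree per node
        self.beam_width = beam_width  # hypothetical beam size used by searches
        self.vectors = {}             # id -> tuple of floats
        self.neighbors = {}           # id -> list of neighbor ids
        self.deleted = set()          # tombstoned ids (still traversable)
        self.entry = None             # fixed entry point: the first inserted id

    def _beam_search(self, query, width):
        """Return up to `width` (distance, id) pairs, closest first."""
        if self.entry is None:
            return []
        d0 = math.dist(query, self.vectors[self.entry])
        visited = {self.entry}
        frontier = [(d0, self.entry)]  # min-heap of nodes still to expand
        best = [(-d0, self.entry)]     # max-heap (negated) of best results so far
        while frontier:
            d, node = heapq.heappop(frontier)
            if len(best) >= width and d > -best[0][0]:
                break  # closest unexpanded node cannot improve the beam
            for nb in self.neighbors[node]:
                if nb in visited:
                    continue
                visited.add(nb)
                dn = math.dist(query, self.vectors[nb])
                if len(best) < width or dn < -best[0][0]:
                    heapq.heappush(frontier, (dn, nb))
                    heapq.heappush(best, (-dn, nb))
                    if len(best) > width:
                        heapq.heappop(best)  # drop the current worst result
        return sorted((-nd, n) for nd, n in best)

    def insert(self, vid, vec):
        """Link a new point to its nearest found neighbors, with back-edges."""
        vec = tuple(vec)
        candidates = self._beam_search(vec, self.beam_width)
        self.vectors[vid] = vec
        chosen = [n for _, n in candidates[: self.max_degree]]
        self.neighbors[vid] = chosen
        for n in chosen:
            self.neighbors[n].append(vid)
            if len(self.neighbors[n]) > self.max_degree:
                # Naive pruning: keep only the closest neighbors of n.
                self.neighbors[n].sort(
                    key=lambda m: math.dist(self.vectors[n], self.vectors[m])
                )
                del self.neighbors[n][self.max_degree:]
        if self.entry is None:
            self.entry = vid

    def delete(self, vid):
        """Tombstone only: the node stays in the graph but is hidden from results."""
        self.deleted.add(vid)

    def search(self, query, k):
        """Return up to k ids of live points nearest to the query."""
        found = self._beam_search(tuple(query), max(k, self.beam_width))
        return [n for _, n in found if n not in self.deleted][:k]


if __name__ == "__main__":
    index = ToyDynamicGraphIndex()
    for i in range(6):
        index.insert(i, (float(i), 0.0))
    print(index.search((4.9, 0.0), k=2))
    index.delete(5)
    print(index.search((4.9, 0.0), k=2))
```

Tombstones keep deletion cheap, but deleted nodes accumulate in the graph and the naive pruning can drop useful edges, so a scheme this simple is exposed to the quality degradation under updates that the abstract attributes to existing dynamic indexes.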
