Speed-ANN: Low-Latency and High-Accuracy Nearest Neighbor Search via Intra-Query Parallelism
Zhen Peng, Minjia Zhang, Kai Li, Ruoming Jin, Bin Ren

TL;DR
Speed-ANN introduces a novel intra-query parallelism approach for similarity graph-based nearest neighbor search, significantly reducing latency and improving scalability on large datasets using multi-core CPUs.
Contribution
The paper proposes Speed-ANN, a parallel search algorithm that exploits intra-query parallelism and the memory hierarchy to improve speed and accuracy in high-dimensional NNS tasks.
Findings
Speed-ANN achieves shorter query latency than NSG and HNSW.
It outperforms GPU implementations in search speed.
It scales effectively with increasing CPU cores and dataset sizes.
Abstract
Nearest Neighbor Search (NNS) has recently drawn rapidly growing interest due to its core role in managing high-dimensional vector data in data science and AI applications. The interest is fueled by the success of neural embeddings, where deep learning models transform unstructured data into semantically correlated feature vectors for data analysis, e.g., recommending popular items. Among several categories of methods for fast NNS, similarity graphs are one of the most successful algorithmic trends. Several of the most popular and top-performing similarity graphs, such as NSG and HNSW, at their core employ best-first traversal along the underlying graph indices to search for near neighbors. Maximizing the performance of the search is essential for many tasks, especially in the large-scale and high-recall regime. In this work, we provide an in-depth examination of the challenges of the…
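The best-first traversal that the abstract attributes to graph indices such as NSG and HNSW can be illustrated with a minimal sequential sketch. This is a generic textbook-style version, not Speed-ANN's parallel algorithm and not the code of any library; the function name `best_first_search`, the parameter `ef` (size of the kept result list), the Euclidean metric, and the toy chain graph are all assumptions made for illustration.

```python
import heapq
import math


def best_first_search(graph, vectors, query, entry, k, ef):
    """Return up to k visited nodes closest to `query`, nearest first.

    graph:   dict mapping node id -> list of neighbor ids (hypothetical toy index)
    vectors: dict mapping node id -> coordinate tuple
    ef:      how many best results to keep while searching (assumed, ef >= k)
    """
    def dist(node):
        return math.dist(vectors[node], query)  # Euclidean distance (assumed metric)

    visited = {entry}
    d0 = dist(entry)
    candidates = [(d0, entry)]  # min-heap: closest unexpanded node on top
    results = [(-d0, entry)]    # max-heap via negation: worst kept result on top

    while candidates:
        d, node = heapq.heappop(candidates)
        # Stop when the result list is full and the best remaining candidate
        # is farther than the worst result we are keeping.
        if len(results) >= ef and d > -results[0][0]:
            break
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            d_nb = dist(nb)
            if len(results) < ef or d_nb < -results[0][0]:
                heapq.heappush(candidates, (d_nb, nb))
                heapq.heappush(results, (-d_nb, nb))
                if len(results) > ef:
                    heapq.heappop(results)  # drop the current worst result

    ordered = sorted((-neg_d, node) for neg_d, node in results)
    return [node for _, node in ordered[:k]]


# Toy example: five points on a line, connected as a chain.
vectors = {i: (float(i), 0.0) for i in range(5)}
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(best_first_search(graph, vectors, (3.2, 0.0), entry=0, k=2, ef=3))  # [3, 4]
```

Each iteration expands only the single closest unexpanded candidate, so the loop as written is inherently sequential within one query; that per-query sequential structure is the setting in which the paper's intra-query parallelism is proposed.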
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Data Management and Algorithms
