Range Retrieval with Graph-Based Indices
Magdalen Dobson Manohar, Taekseung Kim, and Guy E. Blelloch

TL;DR
This paper introduces new graph-based range retrieval algorithms that efficiently find all points within a specified distance in high-dimensional datasets, significantly improving throughput over existing methods.
Contribution
The paper presents novel range retrieval algorithms for graph-based indices, along with a comprehensive dataset analysis and benchmarking for large-scale high-dimensional data.
Findings
Up to 100x faster query throughput compared to standard graph search.
Up to 10x improvement over previous range search modifications.
Effective performance on datasets with up to 1 billion points.
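This page does not say what "standard graph search" means. As a rough illustration only, the sketch below assumes it is fixed-width beam search over a proximity graph, followed by a filter on the radius. The names `beam_search`, `range_via_beam`, and the toy chain graph are hypothetical and are not taken from the paper.

```python
import heapq
import math

def beam_search(graph, points, query, start, beam_width):
    """Greedy beam search; returns up to beam_width (distance, id) pairs, nearest first."""
    visited = {start}
    d0 = math.dist(points[start], query)
    frontier = [(d0, start)]   # min-heap of candidates still to expand
    best = [(-d0, start)]      # max-heap (negated distances) holding the current beam
    while frontier:
        d, u = heapq.heappop(frontier)
        # Stop once the closest unexpanded candidate is worse than the whole beam.
        if len(best) >= beam_width and d > -best[0][0]:
            break
        for v in graph[u]:
            if v in visited:
                continue
            visited.add(v)
            dv = math.dist(points[v], query)
            if len(best) < beam_width or dv < -best[0][0]:
                heapq.heappush(frontier, (dv, v))
                heapq.heappush(best, (-dv, v))
                if len(best) > beam_width:
                    heapq.heappop(best)  # evict the farthest beam entry
    return sorted((-nd, v) for nd, v in best)

def range_via_beam(graph, points, query, radius, start=0, beam_width=4):
    """Assumed baseline: run beam search, then keep only hits within radius."""
    found = beam_search(graph, points, query, start, beam_width)
    return [v for d, v in found if d <= radius]

# Toy index: six points on a line, each linked to its immediate neighbours.
points = [(float(i), 0.0) for i in range(6)]
graph = {i: [j for j in (i - 1, i + 1) if 0 <= j < 6] for i in range(6)}
query = (2.2, 0.0)

wide = range_via_beam(graph, points, query, radius=2.0, beam_width=4)
narrow = range_via_beam(graph, points, query, radius=2.0, beam_width=2)
```

In this toy example, four points lie within radius 2.0 of the query. A beam of width 2 can return at most two of them, while a beam of width 4 returns all four. A fixed beam width caps how many in-range points a query can return, so it has to be sized for the largest expected result set.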
Abstract
Retrieving points based on proximity in a high-dimensional vector space is a crucial step in information retrieval applications. The approximate nearest neighbor search (ANNS) problem, which identifies the nearest neighbors for a query, has been extensively studied in recent years. However, comparatively little attention has been paid to the related problem of finding all points within a given distance of a query, the range retrieval problem, despite its applications in areas such as duplicate detection, plagiarism checking, and facial recognition. In this paper, we present new techniques for range retrieval on graph-based vector indices, which are known to achieve excellent performance on ANNS queries. Since a range query may have anywhere from no matching results to thousands of matching results in the database, we introduce a set of range retrieval algorithms based on…
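To make the problem statement concrete, here is a minimal exact baseline that contrasts a range query with a k-nearest-neighbor query using a linear scan. It illustrates the problem definition only, not the paper's graph-based algorithms. The function names and the toy data are hypothetical.

```python
import math

def range_query(points, query, radius):
    """Exact range retrieval: indices of every point within radius of query."""
    return [i for i, p in enumerate(points) if math.dist(p, query) <= radius]

def knn_query(points, query, k):
    """Exact k-nearest neighbors: always returns k indices, however far away they are."""
    order = sorted(range(len(points)), key=lambda i: math.dist(points[i], query))
    return order[:k]

points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]

near = range_query(points, (0.0, 0.0), 1.5)    # two points fall inside the radius
far = range_query(points, (10.0, 10.0), 1.0)   # nothing is in range, so the result is empty
top2 = knn_query(points, (10.0, 10.0), 2)      # k-NN still returns two points
```

The result size of a range query depends on the data and the radius, so it can be empty or very large, whereas a k-NN query always returns exactly `k` items.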
Taxonomy
Topics: Data Management and Algorithms
Methods: Softmax · Attention Is All You Need · Sparse Evolutionary Training
