Revisiting the Index Construction of Proximity Graph-Based Approximate Nearest Neighbor Search
Shuo Yang, Jiadong Xie, Yingfan Liu, Jeffrey Xu Yu, Xiyue Gao, Qianru Wang, Yanguo Peng, Jiangtao Cui

TL;DR
This paper proposes a new framework to accelerate proximity graph construction for approximate nearest neighbor search, significantly reducing build time without sacrificing search accuracy.
Contribution
It introduces a novel pruning-based construction framework for RNG and NSWG, improving efficiency and scalability of PG-based $k$-ANN methods.
Findings
Achieves up to 5.6x speedup in graph construction
Maintains comparable $k$-ANN search performance
Enhances scalability for large high-dimensional datasets
Abstract
Proximity graphs (PG) have gained increasing popularity as the state-of-the-art solutions to $k$-approximate nearest neighbor ($k$-ANN) search on high-dimensional data, which serves as a fundamental function in various fields, e.g., retrieval-augmented generation. Although PG-based approaches have the best $k$-ANN search performance, their index construction cost is superlinear in the number of points. Such superlinear cost substantially limits their scalability in the era of big data. Hence, the goal of this paper is to accelerate the construction of PG-based methods without compromising their $k$-ANN search performance. To achieve this goal, two mainstream categories of PG are revisited: relative neighborhood graph (RNG) and navigable small world graph (NSWG). By revisiting their construction processes, we identify the sources of their construction inefficiency. To address these issues, we propose…
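For readers unfamiliar with RNG-style graphs, the sketch below illustrates the edge-pruning rule commonly associated with relative neighborhood graphs: a candidate neighbor is kept only if no already-selected neighbor lies closer to it than the point being inserted. This is a generic illustration, not the framework proposed in the paper; the function name `rng_prune`, its parameters, and the use of Euclidean distance are assumptions made for the example.

```python
import math


def rng_prune(point, candidates, max_degree):
    """Select neighbors of `point` with an RNG-style pruning rule.

    Hypothetical helper for illustration only; it is not the
    construction framework described in the paper.
    """
    # Visit candidates from nearest to farthest (Euclidean distance assumed).
    ordered = sorted(candidates, key=lambda c: math.dist(point, c))
    selected = []
    for cand in ordered:
        if len(selected) >= max_degree:
            break
        d_point_cand = math.dist(point, cand)
        # Keep cand only if every selected neighbor is at least as far
        # from cand as point is; otherwise the edge is considered redundant.
        if all(math.dist(s, cand) >= d_point_cand for s in selected):
            selected.append(cand)
    return selected


neighbors = rng_prune((0.0, 0.0), [(1.0, 0.0), (2.0, 0.0), (0.0, 1.5)], max_degree=3)
```

Here `(2.0, 0.0)` is dropped because the already-selected `(1.0, 0.0)` is closer to it than the origin is. Each pruning decision needs extra pairwise distance computations between candidates, which gives a sense of why construction can dominate the cost of such indexes.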
Taxonomy
Topics: Data Management and Algorithms · Rough Sets and Fuzzy Logic · Multi-Criteria Decision Making
