Fast Approximate Nearest Neighbor Search With The Navigating Spreading-out Graph
Cong Fu, Chao Xiang, Changxu Wang, Deng Cai

TL;DR
This paper introduces the Navigating Spreading-out Graph (NSG), a graph-based index for approximate nearest neighbor search that the authors report outperforms existing algorithms in search efficiency and scales to billion-node datasets.
Contribution
The paper proposes the NSG structure, which approximates the Monotonic Relative Neighborhood Graph (MRNG), to improve search efficiency and scalability in billion-node ANNS applications.
Findings
NSG achieves near-logarithmic search complexity.
NSG outperforms existing algorithms on large-scale datasets.
NSG is integrated into Alibaba's Taobao search engine.
Abstract
Approximate nearest neighbor search (ANNS) is a fundamental problem in databases and data mining. A scalable ANNS algorithm should be both memory-efficient and fast. Some early graph-based approaches have shown attractive theoretical guarantees on search time complexity, but they all suffer from high indexing time complexity. Recently, some graph-based methods have been proposed to reduce indexing complexity by approximating the traditional graphs; these methods have achieved revolutionary performance on million-scale datasets. Yet they still cannot scale to billion-node databases. In this paper, to further improve the search efficiency and scalability of graph-based methods, we start by introducing four aspects: (1) ensuring the connectivity of the graph; (2) lowering the average out-degree of the graph for fast traversal; (3) shortening the search path; and (4)…
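To make aspects (1) to (3) concrete, here is a minimal sketch of generic best-first search over a proximity graph, the kind of query procedure that graph-based ANNS methods rely on. It is not the authors' code and it does not build an NSG; the function name, the `pool_size` parameter, and the toy chain graph are all illustrative assumptions. The graph must be connected for the target to be reachable from the start node, the out-degree bounds the work of the inner loop, and the path length bounds how many nodes are expanded.

```python
import heapq


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def greedy_graph_search(points, neighbors, query, start, pool_size=8, k=1):
    """Best-first search over a directed proximity graph (hypothetical helper).

    points:    list of vectors
    neighbors: neighbors[i] is the list of out-neighbor indices of node i
    Returns (indices of the k closest nodes found, distance computations used).
    """
    visited = {start}
    d0 = squared_l2(points[start], query)
    dist_evals = 1
    candidates = [(d0, start)]  # min-heap of nodes still to expand
    pool = [(-d0, start)]       # max-heap holding the best pool_size nodes so far

    while candidates:
        d, node = heapq.heappop(candidates)
        # Stop once the closest unexpanded node is worse than the worst pooled one.
        if len(pool) >= pool_size and d > -pool[0][0]:
            break
        for nb in neighbors[node]:  # cost per expansion grows with out-degree
            if nb in visited:
                continue
            visited.add(nb)
            dn = squared_l2(points[nb], query)
            dist_evals += 1
            if len(pool) < pool_size or dn < -pool[0][0]:
                heapq.heappush(candidates, (dn, nb))
                heapq.heappush(pool, (-dn, nb))
                if len(pool) > pool_size:
                    heapq.heappop(pool)  # drop the current worst

    best = sorted((-neg_d, i) for neg_d, i in pool)[:k]
    return [i for _, i in best], dist_evals


# Toy example (assumed data): ten 1-D points linked in a chain.
points = [[float(i)] for i in range(10)]
neighbors = [[j for j in (i - 1, i + 1) if 0 <= j < 10] for i in range(10)]
result, evals = greedy_graph_search(points, neighbors, [7.2], start=0, pool_size=2)
print(result, evals)
```

On the chain graph the search has to walk through every intermediate node before reaching the query's neighborhood, which illustrates why a long search path is costly even when the out-degree is small.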
Taxonomy
Topics: Data Management and Algorithms · Advanced Image and Video Retrieval Techniques · Web Data Mining and Analysis
