Subspace Collision: An Efficient and Accurate Framework for High-dimensional Approximate Nearest Neighbor Search
Jiuqi Wei, Xiaodong Lee, Zhenyu Liao, Themis Palpanas, Botao Peng

TL;DR
This paper introduces Subspace Collision, a novel ANN search framework with theoretical guarantees that significantly improves speed and memory efficiency over existing methods, especially on challenging high-dimensional datasets.
Contribution
The paper proposes the SC-score metric and the Subspace Collision framework, which the authors present as the first ANN approach to combine efficient indexing and query answering with rigorous theoretical guarantees on answer quality.
Findings
SuCo outperforms state-of-the-art ANN methods in speed and memory efficiency
SuCo achieves 10-100 times faster query answering with less memory
SuCo performs best on hard datasets, matching or surpassing methods without guarantees
Abstract
Approximate Nearest Neighbor (ANN) search in high-dimensional Euclidean spaces is a fundamental problem with a wide range of applications. However, there is currently no ANN method that performs well in both indexing and query answering performance, while providing rigorous theoretical guarantees for the quality of the answers. In this paper, we first design SC-score, a metric that we show follows the Pareto principle and can act as a proxy for the Euclidean distance between data points. Inspired by this, we propose a novel ANN search framework called Subspace Collision (SC), which can provide theoretical guarantees on the quality of its results. We further propose SuCo, which achieves efficient and accurate ANN search by designing a clustering-based lightweight index and query strategies for our proposed subspace collision framework. Extensive experiments on real-world datasets…
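The abstract does not spell out how SC-score is computed, so the following is only a minimal sketch under stated assumptions. It assumes that the dimensions are split into equal-width subspaces, that a data point "collides" with the query in a subspace when it is among the closest alpha fraction of points there, and that a point's SC-score is the number of subspaces in which it collides. The function names and the parameters `alpha` and `beta` are hypothetical. The sketch uses brute-force scans in place of SuCo's clustering-based index, so it illustrates the scoring idea rather than the authors' implementation.

```python
import random


def split_dims(dim, n_subspaces):
    """Split range(dim) into n_subspaces contiguous chunks of near-equal size."""
    base, extra = divmod(dim, n_subspaces)
    bounds, start = [], 0
    for i in range(n_subspaces):
        end = start + base + (1 if i < extra else 0)
        bounds.append((start, end))
        start = end
    return bounds


def sq_dist(a, b, lo, hi):
    """Squared Euclidean distance restricted to dimensions lo..hi-1."""
    return sum((a[j] - b[j]) ** 2 for j in range(lo, hi))


def sc_scores(data, query, n_subspaces, alpha):
    """Assumed SC-score: number of subspaces where a point collides with the query."""
    n = len(data)
    n_collide = max(1, int(alpha * n))  # assumed collision threshold per subspace
    scores = [0] * n
    for lo, hi in split_dims(len(query), n_subspaces):
        # Brute-force ranking inside the subspace (no index, unlike SuCo).
        order = sorted(range(n), key=lambda i: sq_dist(data[i], query, lo, hi))
        for i in order[:n_collide]:
            scores[i] += 1
    return scores


def sc_search(data, query, k, n_subspaces=4, alpha=0.1, beta=0.2):
    """Keep the highest-scoring candidates, then re-rank them by true distance."""
    scores = sc_scores(data, query, n_subspaces, alpha)
    n_cand = max(k, int(beta * len(data)))  # assumed candidate-set size
    candidates = sorted(range(len(data)), key=lambda i: -scores[i])[:n_cand]
    candidates.sort(key=lambda i: sq_dist(data[i], query, 0, len(query)))
    return candidates[:k]


if __name__ == "__main__":
    rng = random.Random(0)
    points = [[rng.random() for _ in range(16)] for _ in range(200)]
    print(sc_search(points, points[7], k=3))
```

Because a point identical to the query has distance zero in every subspace, it collides everywhere and always survives into the re-ranking step. Points that are close in only a few subspaces get lower scores and are more likely to be filtered out before any full-dimensional distance is computed.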
Taxonomy
Topics: Data Management and Algorithms · Metaheuristic Optimization Algorithms Research · Video Surveillance and Tracking Methods
