TaCo: Data-adaptive and Query-aware Subspace Collision for High-dimensional Approximate Nearest Neighbor Search
Jiuqi Wei, Zhenyu Liao, Ruoyu Han, Quanqing Xu, Chuanhui Yang, Themis Palpanas

TL;DR
TaCo is a data-adaptive and query-aware subspace collision framework for high-dimensional approximate nearest neighbor search that improves indexing speed, memory efficiency, and query throughput over existing methods.
Contribution
We introduce a data-adaptive and query-aware subspace collision method that balances index construction and query performance for high-dimensional ANNS.
Findings
Up to 8x faster indexing compared to state-of-the-art methods.
Reduces memory footprint to 0.6x of previous approaches.
Achieves over 1.5x higher query throughput.
Abstract
Approximate Nearest Neighbor Search (ANNS) in high-dimensional Euclidean spaces is a fundamental problem with broad applications. Subspace Collision is a recently proposed ANNS framework that provides a novel paradigm for similarity search and achieves superior indexing and query performance. However, the subspace collision framework remains data-agnostic and query-oblivious, resulting in imbalanced index construction and wasted query overhead. In this paper, we address these limitations in two ways: first, we design a subspace-oriented data transformation mechanism by averaging the entropies computed over each subspace of the transformed data, which ensures balanced subspace partitioning (in an information-theoretic sense) and enables data-adaptive subspace collision; second, we present query-aware and scalable query strategies that dynamically allocate overhead for each query and…
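To make the idea of entropy-balanced subspaces concrete, here is a minimal sketch. It is not the paper's transformation: it assumes each dimension is an independent Gaussian (so a subspace's differential entropy is the sum of per-dimension terms), and it uses a simple reordering of dimensions as a stand-in for a data-adaptive transform. The function names `subspace_entropies`, `balanced_dimension_order`, and `reorder` are hypothetical.

```python
import math
import random
import statistics


def subspace_entropies(data, num_subspaces):
    """Entropy of each contiguous subspace, ASSUMING independent Gaussian dimensions.

    Under that assumption a dimension with variance v has differential entropy
    0.5 * ln(2 * pi * e * v), and a subspace's entropy is the sum over its dimensions.
    """
    dim = len(data[0])
    if dim % num_subspaces != 0:
        raise ValueError("dimension must be divisible by the number of subspaces")
    width = dim // num_subspaces
    variances = [statistics.pvariance(col) for col in zip(*data)]
    per_dim = [0.5 * math.log(2 * math.pi * math.e * v) for v in variances]
    return [sum(per_dim[s * width:(s + 1) * width]) for s in range(num_subspaces)]


def balanced_dimension_order(data, num_subspaces):
    """Hypothetical stand-in transform: deal dimensions to subspaces in snake order.

    Dimensions are ranked by variance (highest first) and handed out
    0, 1, ..., k-1, k-1, ..., 1, 0, ... so that no subspace collects all the
    high-variance (high-entropy) dimensions.
    """
    variances = [statistics.pvariance(col) for col in zip(*data)]
    ranked = sorted(range(len(variances)), key=lambda j: variances[j], reverse=True)
    groups = [[] for _ in range(num_subspaces)]
    for rank, j in enumerate(ranked):
        round_no, pos = divmod(rank, num_subspaces)
        target = pos if round_no % 2 == 0 else num_subspaces - 1 - pos
        groups[target].append(j)
    return [j for group in groups for j in group]


def reorder(data, order):
    """Apply a dimension permutation to every vector."""
    return [[row[j] for j in order] for row in data]


# Synthetic data whose per-dimension spread shrinks from 5.0 down to 0.5,
# so a naive contiguous split puts all the "informative" dimensions first.
random.seed(0)
dim, num_subspaces = 16, 4
scales = [5.0 - 0.3 * j for j in range(dim)]
data = [[random.gauss(0.0, s) for s in scales] for _ in range(500)]

before = subspace_entropies(data, num_subspaces)
order = balanced_dimension_order(data, num_subspaces)
after = subspace_entropies(reorder(data, order), num_subspaces)

spread_before = max(before) - min(before)
spread_after = max(after) - min(after)
print("average entropy:", statistics.fmean(before), "->", statistics.fmean(after))
print("entropy spread: ", spread_before, "->", spread_after)
```

In this toy setting the average subspace entropy is unchanged by the reordering, while the gap between the most and least informative subspace shrinks. That is the sense of "balanced" illustrated here; the paper's actual mechanism may differ substantially.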
Taxonomy
Topics: Data Management and Algorithms · Robotic Path Planning Algorithms · Advanced Image and Video Retrieval Techniques
