Elastic Index Selection for Label-Hybrid AKNN Search
Mingyu Yang, Wenxuan Xia, Wentao Li, Raymond Chi-Wing Wong, Wei Wang

TL;DR
This paper proposes an elastic index selection method for label-hybrid approximate k-nearest neighbor (AKNN) search. It exploits inclusion relationships among query label sets to share indexes across queries, reducing index space and construction time while keeping search performance within guaranteed bounds.
Contribution
It introduces elastic factor bounds and a greedy algorithm for index selection, enabling efficient, scalable label-hybrid AKNN search with performance guarantees.
Findings
Achieves up to 500x search speedup over state-of-the-art methods.
Demonstrates effectiveness on multiple real datasets.
Provides a versatile, library-compatible solution.
Abstract
Real-world vector embeddings are usually associated with extra labels, such as attributes and keywords. Many applications require a nearest neighbor search restricted to entries that contain specific labels, such as searching product image embeddings for a particular brand. A straightforward approach is to materialize all possible indexes according to the complete query label workload. However, this leads to an exponential increase in both index space and processing time, which significantly limits scalability and efficiency. In this paper, we leverage the inclusion relationships among query label sets to construct partial indexes, enabling index sharing across queries for improved construction efficiency. We introduce elastic factor bounds to guarantee search performance and use a greedy algorithm to select indexes that meet the bounds, achieving a tradeoff between efficiency and…
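The index-sharing idea in the abstract can be illustrated with a small Python sketch. Everything concrete here is an assumption made for illustration, not taken from the paper: the abstract does not define the elastic factor, so the sketch assumes it is the ratio |S(L_q)| / |S(L_i)|, where S(L) is the set of entries whose labels contain L and the index label set L_i is a subset of the query label set L_q. The function names `matching_ids` and `greedy_select`, the tie-breaking rule, and the toy data are likewise hypothetical, and the greedy loop is a generic set-cover-style heuristic rather than the authors' algorithm.

```python
def matching_ids(data_labels, label_set):
    """IDs of entries whose label set contains every label in label_set."""
    return {i for i, labels in enumerate(data_labels) if label_set <= labels}


def greedy_select(data_labels, query_label_sets, bound):
    """Pick label sets to index so every query meets the assumed ratio bound.

    Hypothetical sketch: an index on label set idx can serve query q when
    idx is a subset of q (so S(q) is inside S(idx)) and the assumed
    elastic factor |S(q)| / |S(idx)| is at least `bound`.
    """
    assert 0 < bound <= 1
    queries = {frozenset(q) for q in query_label_sets}
    # The empty label set stands for one index over the whole dataset.
    candidates = queries | {frozenset()}
    size = {c: len(matching_ids(data_labels, c)) for c in candidates}

    def can_serve(idx, q):
        return idx <= q and size[idx] > 0 and size[q] / size[idx] >= bound

    # Queries that match no entry need no index at all.
    uncovered = {q for q in queries if size[q] > 0}
    chosen = []
    while uncovered:
        def rank(idx):
            gain = sum(can_serve(idx, q) for q in uncovered)
            # Most newly covered queries first, then the smaller index.
            return (-gain, size[idx], tuple(sorted(idx)))

        best = min(candidates, key=rank)
        chosen.append(best)
        uncovered = {q for q in uncovered if not can_serve(best, q)}
    return chosen


# Toy data: each entry is the label set attached to one vector.
data = [{"a"}, {"a", "b"}, {"a", "b"}, {"a", "b", "c"}, {"b"}, {"c"}]
workload = [{"a"}, {"a", "b"}, {"a", "b", "c"}]
selected = greedy_select(data, workload, bound=0.5)
```

Under these assumptions, a bound of 1.0 forces one index per distinct query label set, while a lower bound lets a query reuse an index built for a subset of its labels, at the cost of scanning more non-matching entries during search.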
Taxonomy
Topics: Data Management and Algorithms · Fuzzy Logic and Control Systems · Rough Sets and Fuzzy Logic
