Approximate Diverse $k$-nearest Neighbor Search in Vector Database
Jiachen Zhao, Xiao Yan, Eric Lo

TL;DR
This paper introduces a new progressive search framework for approximate diverse $k$-nearest neighbor search in vector databases, improving result diversity and approximation quality without extra indexing overhead.
Contribution
It integrates result diversification into state-of-the-art A$k$-NNS methods through a progressive search framework with iterative search, diversification, and verification phases.
Findings
Consistently retrieves near-optimal diverse results
Achieves minimal latency overhead
Performs well across large-scale datasets
Abstract
Approximate $k$-nearest neighbor search (A$k$-NNS) is a core operation in vector databases, underpinning applications such as retrieval-augmented generation (RAG) and image retrieval. In these scenarios, users often prefer diverse result sets to minimize redundancy and enhance information value. However, existing greedy-based diverse methods frequently yield sub-optimal results, failing to adequately approximate the optimal similarity score under a given diversification level. Furthermore, there is a need for flexible algorithms that can adapt to varying user-defined result sizes and diversity requirements. To address these challenges, we propose a novel approach that seamlessly integrates result diversification into state-of-the-art (SOTA) A$k$-NNS methods. Our approach introduces a progressive search framework, consisting of iterative searching, diversification, and verification…
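To make the problem setting concrete, the following is a minimal sketch of a greedy diverse $k$-NN baseline of the kind the abstract describes as sub-optimal, wrapped in a simple loop that widens the candidate pool until $k$ results are found. It is not the paper's algorithm: the abstract is truncated before the details are given. The diversity criterion (a pairwise Euclidean distance threshold `min_dist`), the brute-force candidate search, the pool-doubling schedule, and the function names are all assumptions made for illustration.

```python
import numpy as np


def greedy_diverse_knn(query, vectors, k, min_dist, pool_size):
    """Greedy baseline (illustrative assumption, not the paper's method).

    Scans the pool_size nearest candidates in order of distance to the
    query and keeps a candidate only if it lies at least min_dist away
    from every candidate already kept.
    """
    dists = np.linalg.norm(vectors - query, axis=1)
    pool = np.argsort(dists)[:pool_size]  # brute-force stand-in for an ANN index
    chosen = []
    for idx in pool:
        if all(np.linalg.norm(vectors[idx] - vectors[j]) >= min_dist for j in chosen):
            chosen.append(int(idx))
            if len(chosen) == k:
                break
    return chosen


def progressive_diverse_knn(query, vectors, k, min_dist):
    """Hypothetical progressive loop: double the candidate pool until the
    greedy pass returns k diverse results or the dataset is exhausted."""
    n = len(vectors)
    pool_size = k
    while True:
        result = greedy_diverse_knn(query, vectors, k, min_dist, pool_size)
        if len(result) == k or pool_size >= n:
            return result
        pool_size = min(2 * pool_size, n)


vectors = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 0.0], [1.05, 0.0], [3.0, 0.0]])
query = np.array([0.0, 0.0])
print(progressive_diverse_knn(query, vectors, k=3, min_dist=0.5))  # → [0, 2, 4]
```

In this toy example the first pass over the three nearest candidates yields only two sufficiently distinct points, so the pool is widened until a third is found. A greedy pass like this offers no guarantee that the selected set has the best total similarity among all sets meeting the diversity constraint, which is the gap the abstract points to.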
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Data Management and Algorithms · Image Retrieval and Classification Techniques
