Hybrid LSH: Faster Near Neighbors Reporting in High-dimensional Space
Ninh Pham

TL;DR
This paper introduces a hybrid search method combining LSH and linear search for high-dimensional near neighbor problems, dynamically choosing the best strategy based on data distribution to improve efficiency.
Contribution
It proposes an auxiliary data structure that estimates search costs, enabling adaptive strategy selection in high-dimensional $r$-NN problems.
Findings
Hybrid approach outperforms pure LSH and linear search in experiments.
Efficient cost estimation guides strategy choice regardless of data distribution.
Method is compatible with recent LSH techniques.
Abstract
We study the $r$-near neighbors reporting problem ($r$-NN), i.e., reporting all points in a high-dimensional point set that lie within a radius $r$ of a given query point $q$. Our approach builds on the locality-sensitive hashing (LSH) framework because of its appealing asymptotic sublinear query time for near neighbor search problems in high-dimensional space. A bottleneck of the traditional LSH scheme for solving $r$-NN is that its performance is sensitive to data-dependent and query-dependent parameters. On datasets whose data distributions have diverse local density patterns, LSH with inappropriate tuning parameters can sometimes be outperformed by a simple linear search. In this paper, we introduce a hybrid search strategy between LSH-based search and linear search for $r$-NN in high-dimensional space. By integrating an auxiliary data structure into LSH hash tables, we can…
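The hybrid idea described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the class name `HybridLSH`, its parameters, the choice of Gaussian-projection hash functions for Euclidean distance, and the cost rule are all assumptions made for illustration. In particular, the abstract is truncated before it describes the auxiliary data structure, so this sketch substitutes a crude stand-in, the total number of bucket collisions, as the estimated LSH cost and falls back to a linear scan when that estimate reaches the dataset size.

```python
import math
import random
from collections import defaultdict


class HybridLSH:
    """Toy hybrid r-NN index for Euclidean distance (illustrative sketch only)."""

    def __init__(self, points, r, num_tables=8, hashes_per_table=4, seed=0):
        self.points = points
        self.r = r
        self.width = 4.0 * r  # bucket width; an arbitrary choice for this sketch
        rng = random.Random(seed)
        dim = len(points[0])
        self.tables = []
        for _ in range(num_tables):
            # Each hash is h(x) = floor((a . x + b) / width) with Gaussian a.
            funcs = [
                ([rng.gauss(0.0, 1.0) for _ in range(dim)],
                 rng.uniform(0.0, self.width))
                for _ in range(hashes_per_table)
            ]
            buckets = defaultdict(list)
            for idx, p in enumerate(points):
                buckets[self._key(funcs, p)].append(idx)
            self.tables.append((funcs, buckets))

    def _key(self, funcs, p):
        # Concatenate the hash values of one table into a bucket key.
        return tuple(
            math.floor((sum(ai * xi for ai, xi in zip(a, p)) + b) / self.width)
            for a, b in funcs
        )

    def estimate_lsh_cost(self, q):
        # Assumed cost proxy: total bucket collisions over all tables.
        # This counts duplicates, so it upper-bounds the distinct candidates.
        return sum(
            len(buckets.get(self._key(funcs, q), ()))
            for funcs, buckets in self.tables
        )

    def query(self, q):
        n = len(self.points)
        if self.estimate_lsh_cost(q) >= n:
            # LSH would touch at least n entries: scan everything instead.
            strategy, candidates = "linear", range(n)
        else:
            strategy, candidates = "lsh", set()
            for funcs, buckets in self.tables:
                candidates.update(buckets.get(self._key(funcs, q), ()))
        # Verify every candidate against the true distance.
        hits = sorted(i for i in candidates
                      if math.dist(self.points[i], q) <= self.r)
        return strategy, hits


if __name__ == "__main__":
    data = [(0.0, 0.0, 0.0, 0.0)] * 200
    index = HybridLSH(data, r=1.0)
    print(index.query((0.0, 0.0, 0.0, 0.0))[0])
```

In this sketch a query that lands in a very dense region collides with many points in every table, so the estimate reaches the dataset size and the linear scan is chosen; a query in a sparse region takes the LSH path. The LSH path is probabilistic and may miss some true near neighbors, whereas the linear path reports all of them.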
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Video Surveillance and Tracking Methods · Image Retrieval and Classification Techniques
