Experimental Analysis of Locality Sensitive Hashing Techniques for High-Dimensional Approximate Nearest Neighbor Searches
Omid Jafari, Parth Nagarkar

TL;DR
This paper experimentally compares the performance of different Locality Sensitive Hashing techniques for high-dimensional approximate nearest neighbor searches, focusing on algorithm time and index I/O costs.
Contribution
It provides a comprehensive experimental analysis of LSH techniques, highlighting that C2LSH remains the most effective in terms of performance and accuracy.
Findings
C2LSH outperforms recent competitors in speed and accuracy
Both algorithm time and index I/O are critical for LSH performance
C2LSH maintains state-of-the-art performance across datasets
Abstract
Finding nearest neighbors in high-dimensional spaces is a fundamental operation in many multimedia retrieval applications. Exact tree-based indexing approaches are known to suffer from the notorious curse of dimensionality for high-dimensional data. Approximate searching techniques sacrifice some accuracy while returning good enough results for faster performance. Locality Sensitive Hashing (LSH) is a very popular technique for finding approximate nearest neighbors in high-dimensional spaces. Apart from providing theoretical guarantees on the query results, one of the main benefits of LSH techniques is their good scalability to large datasets because they are external memory based. The most dominant costs for existing LSH techniques are the algorithm time and the index I/Os required to find candidate points. Existing works do not compare both of these dominant costs in their evaluation.…
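To make the candidate-finding step mentioned in the abstract concrete, here is a minimal sketch of a generic random-projection LSH index. It is not C2LSH and not code from the paper. The names `make_hash` and `LSHIndex` and the parameters `num_tables`, `k`, and `w` are assumptions chosen for illustration. It also keeps everything in memory, whereas the techniques in the paper are external memory based.

```python
import math
import random
from collections import defaultdict


def make_hash(dim, k, w, rng):
    """Build one compound hash from k random projections (hypothetical helper).

    Each projection maps v to floor((a . v + b) / w), with a drawn from a
    Gaussian and b drawn uniformly from [0, w).
    """
    funcs = [
        ([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, w))
        for _ in range(k)
    ]

    def h(v):
        return tuple(
            math.floor((sum(a_i * v_i for a_i, v_i in zip(a, v)) + b) / w)
            for a, b in funcs
        )

    return h


class LSHIndex:
    """In-memory illustration only; parameter defaults are arbitrary."""

    def __init__(self, dim, num_tables=8, k=4, w=4.0, seed=0):
        rng = random.Random(seed)
        self.hashes = [make_hash(dim, k, w, rng) for _ in range(num_tables)]
        self.tables = [defaultdict(list) for _ in range(num_tables)]
        self.points = []

    def add(self, v):
        """Store v and register its index in one bucket per table."""
        idx = len(self.points)
        self.points.append(v)
        for h, table in zip(self.hashes, self.tables):
            table[h(v)].append(idx)
        return idx

    def query(self, q, top=1):
        """Collect points sharing a bucket with q, then rank by true distance."""
        candidates = set()
        for h, table in zip(self.hashes, self.tables):
            candidates.update(table.get(h(q), []))
        ranked = sorted(candidates, key=lambda i: math.dist(q, self.points[i]))
        return ranked[:top]
```

In this sketch, the bucket lookups in `query` stand in for the index I/Os that a disk-based index would incur, and the final distance ranking stands in for part of the algorithm time. Those are the two costs the paper compares.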
