Experimental Analysis of Locality Sensitive Hashing Techniques for High-Dimensional Approximate Nearest Neighbor Searches

Omid Jafari; Parth Nagarkar

arXiv:2006.11285·cs.DB·February 16, 2021


TL;DR

This paper experimentally compares the performance of different Locality Sensitive Hashing techniques for high-dimensional approximate nearest neighbor searches, focusing on algorithm time and index I/O costs.

Contribution

It provides a comprehensive experimental analysis of LSH techniques, highlighting that C2LSH remains the most effective in terms of performance and accuracy.

Findings

01. C2LSH outperforms recent competitors in speed and accuracy

02. Both algorithm time and index I/O are critical for LSH performance

03. C2LSH maintains state-of-the-art performance across datasets

Abstract

Finding nearest neighbors in high-dimensional spaces is a fundamental operation in many multimedia retrieval applications. Exact tree-based indexing approaches are known to suffer from the notorious curse of dimensionality for high-dimensional data. Approximate searching techniques sacrifice some accuracy while returning good enough results for faster performance. Locality Sensitive Hashing (LSH) is a very popular technique for finding approximate nearest neighbors in high-dimensional spaces. Apart from providing theoretical guarantees on the query results, one of the main benefits of LSH techniques is their good scalability to large datasets because they are external memory based. The most dominant costs for existing LSH techniques are the algorithm time and the index I/Os required to find candidate points. Existing works do not compare both of these dominant costs in their evaluation.…
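
To make the two costs named in the abstract concrete, here is a minimal in-memory sketch of a basic LSH index built from Gaussian random projections, h(x) = floor((a·x + b) / w). It is not C2LSH and not any of the techniques the paper evaluates. The class name `ToyLSHIndex`, its parameters (`num_tables`, `hashes_per_table`, `w`), and the `bucket_reads` counter are all assumptions made for illustration. The counter is only a rough stand-in for index I/O: a disk-based index would pay for each bucket it fetches, while candidate ranking contributes to algorithm time.

```python
import math
import random
from collections import defaultdict


def make_hash(dim, w, rng):
    """Build one hash function h(x) = floor((a.x + b) / w) with Gaussian a."""
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    b = rng.uniform(0.0, w)

    def h(x):
        projection = sum(ai * xi for ai, xi in zip(a, x))
        return math.floor((projection + b) / w)

    return h


class ToyLSHIndex:
    """Hypothetical toy index: several hash tables, each keyed by a tuple of hashes."""

    def __init__(self, dim, num_tables=8, hashes_per_table=4, w=4.0, seed=0):
        rng = random.Random(seed)
        self.funcs = [
            [make_hash(dim, w, rng) for _ in range(hashes_per_table)]
            for _ in range(num_tables)
        ]
        self.tables = [defaultdict(list) for _ in range(num_tables)]
        self.points = []

    def _key(self, table_id, x):
        return tuple(h(x) for h in self.funcs[table_id])

    def add(self, x):
        idx = len(self.points)
        self.points.append(x)
        for table_id, table in enumerate(self.tables):
            table[self._key(table_id, x)].append(idx)
        return idx

    def query(self, q, k=1):
        """Return (top-k candidate ids, number of candidates, buckets read)."""
        candidates = set()
        bucket_reads = 0  # illustrative proxy for index I/O
        for table_id, table in enumerate(self.tables):
            bucket_reads += 1
            candidates.update(table.get(self._key(table_id, q), ()))
        # Ranking the candidates by true distance is part of algorithm time.
        ranked = sorted(candidates, key=lambda i: math.dist(q, self.points[i]))
        return ranked[:k], len(candidates), bucket_reads


if __name__ == "__main__":
    rng = random.Random(1)
    data = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(200)]
    index = ToyLSHIndex(dim=16)
    for point in data:
        index.add(point)
    ids, num_candidates, reads = index.query(data[5], k=1)
    print(ids, num_candidates, reads)
```

In this sketch, raising `num_tables` tends to improve recall but adds one bucket read per table, while a larger `w` or fewer hashes per table enlarges buckets and so increases the number of candidates that must be ranked. That is the kind of trade-off between index I/O and algorithm time the abstract refers to.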
