Scalability and Total Recall with Fast CoveringLSH
Ninh Pham, Rasmus Pagh

TL;DR
This paper introduces Fast CoveringLSH, a practical and scalable locality-sensitive hashing scheme for high-dimensional Hamming space that guarantees no false negatives and reduces hash computation time relative to CoveringLSH.
Contribution
The paper presents Fast CoveringLSH, an efficient implementation of CoveringLSH that reduces hash computation time while maintaining zero false negatives in high-dimensional similarity search.
Findings
Fast CoveringLSH eliminates false negatives in high-dimensional Hamming space.
It achieves an asymptotic improvement in hash computation time from O(dL) to O(d + L log L).
Experiments show that fcLSH performs comparably to, or better than, traditional methods for search radii up to 20.
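To make the no-false-negatives finding concrete, here is a minimal Python sketch of a CoveringLSH-style family of bit masks. It is an illustration under stated assumptions, not the authors' implementation: it assumes a search radius r, a random mapping of each coordinate to a vector in {0,1}^(r+1), and L = 2^(r+1) - 1 masks, one per nonzero vector v. The names make_mapping, build_masks, hashes and collides are hypothetical.

```python
import random


def parity(n):
    """Parity (0 or 1) of the number of set bits in n."""
    return bin(n).count("1") & 1


def make_mapping(d, r, rng):
    """Assumed setup: map each of the d coordinates to a random (r+1)-bit vector."""
    return [rng.randrange(1 << (r + 1)) for _ in range(d)]


def build_masks(mapping, r):
    """One mask per nonzero v; mask bit i is <m(i), v> mod 2."""
    n = 1 << (r + 1)
    return [[parity(m & v) for m in mapping] for v in range(1, n)]


def hashes(x, all_masks):
    """Hash value under mask a is simply x AND a (kept as a tuple of bits).

    This naive evaluation costs O(d) per mask, so O(dL) in total.
    """
    return [tuple(xi & ai for xi, ai in zip(x, mask)) for mask in all_masks]


def collides(x, y, all_masks):
    """True if x and y share a hash value under at least one mask."""
    hx, hy = hashes(x, all_masks), hashes(y, all_masks)
    return any(a == b for a, b in zip(hx, hy))
```

Why a collision is guaranteed in this sketch: if x and y differ in at most r positions, the vectors assigned to those positions span at most r dimensions of an (r+1)-dimensional space over GF(2), so some nonzero v is orthogonal to all of them. The mask for that v is zero on every differing position, and the two masked vectors are identical.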
Abstract
Locality-sensitive hashing (LSH) has emerged as the dominant algorithmic technique for similarity search with strong performance guarantees in high-dimensional spaces. A drawback of traditional LSH schemes is that they may have \emph{false negatives}, i.e., the recall is less than 100\%. This limits the applicability of LSH in settings requiring precise performance guarantees. Building on the recent theoretical "CoveringLSH" construction that eliminates false negatives, we propose a fast and practical covering LSH scheme for Hamming space called \emph{Fast CoveringLSH (fcLSH)}. Inheriting the design benefits of CoveringLSH, our method avoids false negatives and always reports all near neighbors. Compared to CoveringLSH, we achieve an asymptotic improvement to the hash function computation time from $O(dL)$ to $O(d + L\log L)$, where $d$ is the dimensionality of data…
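One way a bound of the form O(d + L log L) can arise is through a fast Walsh-Hadamard transform. The Python sketch below illustrates that idea under assumptions that are not stated in the text above: each coordinate i is assigned a vector m(i) in {0,1}^(r+1), there are L = 2^(r+1) - 1 masks with mask bit i equal to <m(i), v> mod 2, and each masked vector is summarized by an integer sum with hypothetical per-coordinate coefficients. The function names are also hypothetical, and the sketch should not be read as the paper's exact procedure.

```python
import random


def parity(n):
    """Parity (0 or 1) of the number of set bits in n."""
    return bin(n).count("1") & 1


def naive_hashes(x, mapping, coeffs, r):
    """O(dL): for each nonzero v, sum coeffs over set bits of x kept by mask v."""
    n = 1 << (r + 1)
    return [
        sum(c for xi, m, c in zip(x, mapping, coeffs) if xi and parity(m & v))
        for v in range(1, n)
    ]


def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform, O(n log n) for n a power of 2."""
    a = list(values)
    n, h = len(a), 1
    while h < n:
        for start in range(0, n, 2 * h):
            for j in range(start, start + h):
                u, w = a[j], a[j + h]
                a[j], a[j + h] = u + w, u - w
        h *= 2
    return a


def fast_hashes(x, mapping, coeffs, r):
    """O(d + L log L): bucket coordinates by m(i), then apply one transform."""
    n = 1 << (r + 1)
    buckets = [0] * n
    for xi, m, c in zip(x, mapping, coeffs):  # single O(d) pass over the input
        if xi:
            buckets[m] += c
    w = fwht(buckets)
    # w[v] = sum_u (-1)^<u,v> * buckets[u], so the sum over buckets with
    # odd <u,v> equals (w[0] - w[v]) / 2, which is exactly the masked sum.
    return [(w[0] - w[v]) // 2 for v in range(1, n)]
```

Both functions return the same L values. The second touches each coordinate once and then works only on the 2^(r+1) bucket sums. Summarizing a masked vector by an integer sum can introduce extra collisions, but two inputs that agree on every masked position still receive equal values.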
