Tight Lower Bounds for Data-Dependent Locality-Sensitive Hashing
Alexandr Andoni, Ilya Razenshteyn

TL;DR
This paper establishes a tight lower bound on the exponent for data-dependent Locality-Sensitive Hashing schemes, showing that even data-dependent methods cannot surpass certain theoretical limits in approximate nearest neighbor search.
Contribution
It proves a lower bound on the exponent $\rho$ for data-dependent LSH that matches the known upper bound, and it formalizes a succinctness requirement on hash functions.
Findings
Lower bound matches the upper bound for $\rho$ in $\ell_1$ space.
Even data-dependent hashing must satisfy $\rho \ge \frac{1}{2c-1} - o(1)$.
Formalizes the notion of succinctness for hash functions to exclude infeasible solutions.
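To make the gap between the two regimes concrete, here is a minimal Python sketch that evaluates the exponents for a few approximation factors $c$. It assumes the standard reading of $\rho$ (query time roughly $n^{\rho}$ for $n$ points), takes $\rho = 1/c$ for classical data-independent LSH in $\ell_1$, and drops the $o(1)$ terms. The function names are illustrative and do not come from the paper.

```python
def rho_data_independent(c: float) -> float:
    """Exponent of classical data-independent LSH for l_1: rho = 1/c."""
    if c <= 1:
        raise ValueError("approximation factor c must exceed 1")
    return 1.0 / c


def rho_data_dependent(c: float) -> float:
    """Tight exponent for data-dependent LSH: rho = 1/(2c - 1), o(1) terms dropped."""
    if c <= 1:
        raise ValueError("approximation factor c must exceed 1")
    return 1.0 / (2.0 * c - 1.0)


def query_cost(n: int, rho: float) -> float:
    """Rough query cost n**rho, ignoring constants and lower-order factors."""
    return float(n) ** rho


if __name__ == "__main__":
    n = 10**6  # hypothetical dataset size, chosen only for illustration
    for c in (1.5, 2.0, 4.0):
        r_ind = rho_data_independent(c)
        r_dep = rho_data_dependent(c)
        print(
            f"c={c}: data-independent rho={r_ind:.3f} (~{query_cost(n, r_ind):.0f} ops), "
            f"data-dependent rho={r_dep:.3f} (~{query_cost(n, r_dep):.0f} ops)"
        )
```

For $c = 2$ the exponent drops from $1/2$ to $1/3$. The lower bound says that no data-dependent scheme of the kind considered in the paper can push it lower, up to $o(1)$ terms.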
Abstract
We prove a tight lower bound for the exponent $\rho$ for data-dependent Locality-Sensitive Hashing schemes, recently used to design efficient solutions for the $c$-approximate nearest neighbor search. In particular, our lower bound matches the bound of $\rho \le \frac{1}{2c-1} + o(1)$ for the $\ell_1$ space, obtained via the recent algorithm from [Andoni-Razenshteyn, STOC'15]. In recent years it emerged that data-dependent hashing is strictly superior to the classical Locality-Sensitive Hashing, when the hash function is data-independent. In the latter setting, the best exponent has been already known: for the $\ell_1$ space, the tight bound is $\rho = 1/c$, with the upper bound from [Indyk-Motwani, STOC'98] and the matching lower bound from [O'Donnell-Wu-Zhou, ITCS'11]. We prove that, even if the hashing is data-dependent, it must hold that $\rho \ge \frac{1}{2c-1} - o(1)$. To prove the…
