Optimal Hashing-based Time-Space Trade-offs for Approximate Near Neighbors

Alexandr Andoni; Thijs Laarhoven; Ilya Razenshteyn; Erik Waingarten

arXiv:1608.03580·cs.DS·October 4, 2019

TL;DR

This paper establishes tight upper and lower bounds for time-space trade-offs in approximate near neighbor search in high-dimensional Euclidean spaces, achieving sublinear query time with near-linear space for all approximation factors greater than one.

Contribution

The paper introduces a data structure that achieves the optimal trade-off between space and query time for approximate near neighbor search, and proves matching lower bounds, including the first two-probe space lower bound that is not polynomially smaller than the one-probe bound.

Findings

01

Achieves sublinear query time with near-linear space for all c > 1

02

Provides matching upper and lower bounds on the space versus query-time trade-off

03

Establishes a connection to locally-decodable codes to prove the two-probe lower bound

Abstract

[See the paper for the full abstract.] We show tight upper and lower bounds for time-space trade-offs for the $c$-Approximate Near Neighbor Search problem. For the $d$-dimensional Euclidean space and $n$-point datasets, we develop a data structure with space $n^{1 + \rho_u + o(1)} + O(dn)$ and query time $n^{\rho_q + o(1)} + d n^{o(1)}$ for every $\rho_u, \rho_q \geq 0$ such that: \begin{equation} c^2 \sqrt{\rho_q} + (c^2 - 1) \sqrt{\rho_u} = \sqrt{2c^2 - 1}. \end{equation} This is the first data structure that achieves sublinear query time and near-linear space for every approximation factor $c > 1$, improving upon [Kapralov, PODS 2015]. The data structure is a culmination of a long line of work on the problem for all space regimes; it builds on Spherical Locality-Sensitive Filtering [Becker, Ducas, Gama, Laarhoven, SODA 2016] and data-dependent hashing [Andoni, Indyk, Nguyen,…
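
To make the trade-off equation in the abstract concrete, the sketch below solves it for the query exponent $\rho_q$ given an approximation factor $c$ and a space exponent $\rho_u$. It only rearranges the stated equation; it is not the paper's data structure. The function name `query_exponent` and the floating-point tolerance are assumptions introduced here for illustration.

```python
import math


def query_exponent(c: float, rho_u: float) -> float:
    """Solve c^2*sqrt(rho_q) + (c^2 - 1)*sqrt(rho_u) = sqrt(2c^2 - 1) for rho_q.

    Hypothetical helper: the name and the 1e-12 tolerance are illustrative
    choices, not taken from the paper.
    """
    if c <= 1.0:
        raise ValueError("the trade-off is stated for approximation factors c > 1")
    if rho_u < 0.0:
        raise ValueError("rho_u must be non-negative")
    c2 = c * c
    # Isolate sqrt(rho_q): it equals numerator / c^2.
    numerator = math.sqrt(2.0 * c2 - 1.0) - (c2 - 1.0) * math.sqrt(rho_u)
    if numerator < -1e-12:
        # Past this point the equation has no solution with rho_q >= 0.
        raise ValueError("rho_u exceeds the value at which rho_q reaches 0")
    # Clamp tiny negative rounding errors to zero before squaring.
    return (max(numerator, 0.0) / c2) ** 2


if __name__ == "__main__":
    c = 2.0
    for rho_u in (0.0, 1.0 / 7.0, 7.0 / 9.0):
        print(f"c={c}, rho_u={rho_u:.4f} -> rho_q={query_exponent(c, rho_u):.4f}")
```

For $c = 2$, plugging into the equation gives $\rho_q = (2c^2 - 1)/c^4 = 7/16$ at near-linear space ($\rho_u = 0$), the balanced point $\rho_u = \rho_q = 1/(2c^2 - 1) = 1/7$, and $\rho_q = 0$ once $\rho_u = (2c^2 - 1)/(c^2 - 1)^2 = 7/9$.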
