Improved Space-Efficient Approximate Nearest Neighbor Search Using Function Inversion
Samuel McCauley

TL;DR
This paper uses function inversion to make locality-sensitive hashing (LSH) more space-efficient for approximate nearest neighbor (ANN) search, reducing space requirements and improving query times on high-dimensional data.
Contribution
The paper presents a method that uses function inversion to simplify and improve LSH-based ANN data structures, in particular improving the query time of the ALRW data structure for Euclidean distance.
Findings
Reduces space usage of LSH-based ANN data structures.
Improves query times for Euclidean ANN.
Shows that list-of-points data structures are not optimal for Euclidean or Manhattan ANN.
Abstract
Approximate nearest neighbor search (ANN) data structures have widespread applications in machine learning, computational biology, and text processing. The goal of ANN is to preprocess a set S so that, given a query q, we can find a point y whose distance from q approximates the smallest distance from q to any point in S. For most distance functions, the best-known ANN bounds for high-dimensional point sets are obtained using techniques based on locality-sensitive hashing (LSH). Unfortunately, space efficiency is a major challenge for LSH-based data structures. Classic LSH techniques require a very large amount of space, often polynomial in |S|. A long line of work has developed intricate techniques to reduce this space usage, but these techniques suffer from downsides: they must be hand-tailored to each specific LSH, are often complicated, and their space reduction comes at the…
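To make the space concern concrete, here is a minimal sketch of a classic multi-table LSH index, the kind of baseline the abstract contrasts against. It is not the paper's function-inversion method. It assumes Hamming distance with bit-sampling hashes, and the class name `BitSamplingLSH` and the parameters `k` and `num_tables` are hypothetical names chosen for illustration. The point to notice is that every point in S is stored once per table, so space grows as the number of tables times |S|.

```python
import random
from collections import defaultdict


def hamming(a, b):
    """Number of coordinates in which two equal-length tuples differ."""
    return sum(x != y for x, y in zip(a, b))


class BitSamplingLSH:
    """Illustrative classic LSH index for Hamming distance (an assumption,
    not the data structure from the paper)."""

    def __init__(self, points, dim, k=8, num_tables=10, seed=0):
        rng = random.Random(seed)
        self.points = points
        # Each table hashes a point by reading k randomly sampled coordinates.
        self.projections = [
            [rng.randrange(dim) for _ in range(k)] for _ in range(num_tables)
        ]
        self.tables = [defaultdict(list) for _ in range(num_tables)]
        # Every point is inserted into every table: this is the space cost.
        for idx, p in enumerate(points):
            for proj, table in zip(self.projections, self.tables):
                table[self._key(p, proj)].append(idx)

    @staticmethod
    def _key(p, proj):
        return tuple(p[i] for i in proj)

    def stored_entries(self):
        """Total number of point references held across all tables."""
        return sum(len(bucket) for t in self.tables for bucket in t.values())

    def query(self, q):
        """Return (index, distance) of the closest colliding point,
        or (None, None) if q collides with nothing."""
        best, best_dist = None, None
        for proj, table in zip(self.projections, self.tables):
            for idx in table.get(self._key(q, proj), ()):
                d = hamming(q, self.points[idx])
                if best_dist is None or d < best_dist:
                    best, best_dist = idx, d
        return best, best_dist


# Small synthetic data set of random 32-bit points (illustrative only).
rng = random.Random(1)
dim = 32
S = [tuple(rng.randrange(2) for _ in range(dim)) for _ in range(200)]
index = BitSamplingLSH(S, dim)
```

Calling `index.stored_entries()` reports how many point references the tables hold in total, which is `num_tables` times the size of `S` in this sketch. Reducing that multiplicative overhead is the kind of space saving the abstract is concerned with.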
