On fast bounded locality sensitive hashing
Piotr Wygocki

TL;DR
This paper analyzes scalar product hash functions with bounded random vectors, deriving anti-concentration bounds that lead to improved high-dimensional approximate nearest neighbor algorithms with no false negatives.
Contribution
It introduces optimal anti-concentration bounds for bounded distributions, enhancing c-approximate nearest neighbor search without false negatives in high dimensions.
Findings
Optimal anti-concentration bounds for bounded vectors in $l_\infty$ and $l_2$ spaces.
Improved algorithms for high-dimensional $l_p$ nearest neighbor search without false negatives.
Progress on the open problem of false negative-free nearest neighbor search for Hamming distance.
Abstract
In this paper, we examine the hash functions expressed as scalar products, i.e., $f(x) = \langle v, x \rangle$, for some bounded random vector $v$. Such hash functions have numerous applications, but often there is a need to optimize the choice of the distribution of $v$. In the present work, we focus on so-called anti-concentration bounds, i.e., the upper bounds of $\mathbb{P}[|\langle v, x \rangle| < \alpha]$. In many applications, $v$ is a vector of independent random variables with standard normal distribution. In such a case, the distribution of $\langle v, x \rangle$ is also normal and it is easy to approximate $\mathbb{P}[|\langle v, x \rangle| < \alpha]$. Here, we consider two bounded distributions in the context of the anti-concentration bounds. Particularly, we analyze $v$ being a random vector from the unit ball in $l_\infty$ and $v$ being a random vector from the unit sphere in $l_2$. We show optimal up to a…
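The quantity bounded in the abstract can be explored numerically. The sketch below is an illustration only, not the paper's method. It draws a random vector $v$ either uniformly from the $l_\infty$ unit ball (the cube $[-1, 1]^d$) or uniformly from the $l_2$ unit sphere, and estimates by Monte Carlo how often the scalar product with a fixed input $x$ falls below a threshold $\alpha$ in absolute value. The symbols `v`, `x`, `alpha` and all function names are assumptions chosen for this example.

```python
import math
import random


def sample_linf_ball(d, rng):
    """Uniform point in the l_infinity unit ball, i.e. the cube [-1, 1]^d."""
    return [rng.uniform(-1.0, 1.0) for _ in range(d)]


def sample_l2_sphere(d, rng):
    """Uniform point on the l_2 unit sphere: normalize a standard Gaussian vector."""
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(t * t for t in g))
    return [t / norm for t in g]


def dot(v, x):
    """Scalar product <v, x> of two equal-length sequences."""
    return sum(a * b for a, b in zip(v, x))


def small_projection_frequency(sampler, x, alpha, trials=20000, seed=0):
    """Monte Carlo estimate of P[|<v, x>| < alpha] for v drawn by `sampler`.

    `sampler` is a hypothetical helper taking (dimension, rng) and returning
    one random vector; the estimate is an empirical frequency, not a bound.
    """
    rng = random.Random(seed)
    d = len(x)
    hits = 0
    for _ in range(trials):
        v = sampler(d, rng)
        if abs(dot(v, x)) < alpha:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    d = 16
    x = [1.0 / math.sqrt(d)] * d  # an arbitrary unit-length test input
    for name, sampler in [("l_inf ball", sample_linf_ball),
                          ("l_2 sphere", sample_l2_sphere)]:
        est = small_projection_frequency(sampler, x, alpha=0.1)
        print(f"{name}: estimated P[|<v, x>| < 0.1] = {est:.4f}")
```

A quick sanity check of the estimator: for $x = e_1$ and $v$ uniform in the cube, $\langle v, x \rangle$ is just the first coordinate of $v$, which is uniform on $[-1, 1]$, so the exact probability is $\alpha$ for any $\alpha \le 1$.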
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Optimization and Search Problems · Robotics and Sensor-Based Localization
