Randomized embeddings with slack, and high-dimensional Approximate Nearest Neighbor
Evangelos Anagnostopoulos, Ioannis Z. Emiris, and Ioannis Psarros

TL;DR
This paper addresses approximate nearest neighbor search in high-dimensional Euclidean space through randomized embeddings with slack, targeting better query time and space usage than existing methods such as LSH.
Contribution
It proposes a new low-quality embedding framework and a randomized projection technique that reduces dimensionality, leading to more efficient ANN algorithms with better theoretical and practical performance.
Findings
Achieves faster query times than BBD-tree based methods.
Provides linear space data structures for approximate near neighbor search.
Experimental results outperform theoretical predictions up to 500 dimensions.
Abstract
The approximate nearest neighbor problem (ε-ANN) in high dimensional Euclidean space has been mainly addressed by Locality Sensitive Hashing (LSH), which has polynomial dependence in the dimension, sublinear query time, but subquadratic space requirement. In this paper, we introduce a new definition of "low-quality" embeddings for metric spaces. It requires that, for some query point q, there exists an approximate nearest neighbor among the pre-images of the k > 1 approximate nearest neighbors in the target space. Focusing on Euclidean spaces, we employ random projections in order to reduce the original problem to one in a space of dimension inversely proportional to k. The k approximate nearest neighbors can be efficiently retrieved by a data structure such as BBD-trees. The same approach is applied to the problem of computing an approximate near neighbor, where we…
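The reduction described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a brute-force scan stands in for the BBD-tree, the projection is a plain Gaussian matrix, and the function names (`gaussian_projection`, `ann_query`) and the values of `d`, `d_proj`, `n`, and `k` are hypothetical choices made only for the example.

```python
import math
import random


def gaussian_projection(d, d_proj, rng):
    """Return a d_proj x d matrix with i.i.d. N(0, 1/d_proj) entries."""
    scale = 1.0 / math.sqrt(d_proj)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(d_proj)]


def project(matrix, point):
    """Apply the linear map given by `matrix` to `point`."""
    return [sum(a * x for a, x in zip(row, point)) for row in matrix]


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def ann_query(points, projected, matrix, query, k):
    """Return the index of the best candidate among k projected neighbors.

    Step 1: project the query into the low-dimensional space.
    Step 2: find the k nearest projected points (brute force here; the
            paper suggests a structure such as a BBD-tree for this step).
    Step 3: among the pre-images of those k points, return the one that
            is closest to the query in the original space.
    """
    fq = project(matrix, query)
    candidates = sorted(
        range(len(points)), key=lambda i: sq_dist(projected[i], fq)
    )[:k]
    return min(candidates, key=lambda i: sq_dist(points[i], query))


# Hypothetical sizes chosen only for illustration.
rng = random.Random(0)
d, d_proj, n = 50, 5, 200
points = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
matrix = gaussian_projection(d, d_proj, rng)
projected = [project(matrix, p) for p in points]
query = [rng.gauss(0.0, 1.0) for _ in range(d)]

answer = ann_query(points, projected, matrix, query, k=20)
exact = min(range(n), key=lambda i: sq_dist(points[i], query))
```

The slack is visible in the role of `k`: the projection does not need to preserve the nearest neighbor itself, only to keep some acceptable neighbor among the k closest projected points. Setting `k` equal to `n` makes the search exact, while a smaller `k` trades accuracy for fewer distance computations in the original space.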
