Fast Nearest Neighbor Preserving Embeddings

Johan Sivertsen

arXiv:1707.06867 · cs.DS · July 24, 2017

TL;DR

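This paper introduces a sparse, randomized embedding that preserves approximate nearest neighbors in high-dimensional data. The embedding dimension is bounded by the dataset's doubling dimension rather than by its size, and the embedded points can be used with existing approximate nearest neighbor data structures to speed up search.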

Contribution

It presents a novel analog to the Fast Johnson-Lindenstrauss Transform tailored for nearest neighbor preservation, with dimensionality bounds tied to dataset doubling dimension.

Findings

01 Reduces embedding dimension for real-world datasets

02 Speeds up approximate nearest neighbor searches

03 Embeddings are sparse and computationally efficient

Abstract

We show an analog to the Fast Johnson-Lindenstrauss Transform for Nearest Neighbor Preserving Embeddings in $\ell_2$. These are sparse, randomized embeddings that preserve the (approximate) nearest neighbors. The dimensionality of the embedding space is bounded not by the size of the embedded set $n$, but by its doubling dimension $\lambda$. For most large real-world datasets this will mean a considerably lower-dimensional embedding space than possible when preserving all distances. The resulting embeddings can be used with existing approximate nearest neighbor data structures to yield speed improvements.
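To make the idea of a sparse, randomized embedding concrete, here is a minimal sketch of the generic Fast Johnson-Lindenstrauss shape: random sign flips, a normalized Walsh-Hadamard transform, then a sparse Gaussian projection. It is an illustration under stated assumptions, not the paper's construction. The names `fwht` and `make_embedding`, the target dimension `k`, and the sparsity `q` are all hypothetical choices; the abstract does not give the actual parameters or how they depend on the doubling dimension.

```python
import numpy as np

def fwht(x):
    """Unnormalized fast Walsh-Hadamard transform along the last axis."""
    x = np.array(x, dtype=float)          # work on a copy
    d = x.shape[-1]
    assert d > 0 and d & (d - 1) == 0, "length must be a power of two"
    h = 1
    while h < d:
        for i in range(0, d, 2 * h):
            a = x[..., i:i + h].copy()
            b = x[..., i + h:i + 2 * h].copy()
            x[..., i:i + h] = a + b
            x[..., i + h:i + 2 * h] = a - b
        h *= 2
    return x

def make_embedding(d, k, q, rng):
    """Return a linear map R^d -> R^k of the form x -> P H D x / sqrt(k).

    k and q are illustrative assumptions, not values from the paper.
    """
    signs = rng.choice([-1.0, 1.0], size=d)            # D: random diagonal signs
    mask = rng.random((k, d)) < q                      # sparsity pattern of P
    gauss = rng.normal(0.0, 1.0 / np.sqrt(q), size=(k, d))
    P = np.where(mask, gauss, 0.0)                     # each entry has variance 1

    def embed(X):
        Y = fwht(np.asarray(X, dtype=float) * signs) / np.sqrt(d)  # orthonormal H D
        return (Y @ P.T) / np.sqrt(k)

    return embed

# Small demo on synthetic data (assumed sizes, chosen only for illustration).
rng = np.random.default_rng(0)
d, k, q = 256, 128, 0.25
X = rng.normal(size=(200, d))
embed = make_embedding(d, k, q, rng)
Z = embed(X)                                           # shape (200, k)
ratio = np.sum(Z**2, axis=1) / np.sum(X**2, axis=1)    # squared-norm distortion per point
```

Because each entry of `P` has unit variance and the Hadamard step is orthonormal, squared norms are preserved in expectation. Embedded points such as `Z` could then be passed to an existing approximate nearest neighbor index in place of `X`.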

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Face and Expression Recognition · Stochastic Gradient Optimization Techniques