From Average Embeddings To Nearest Neighbor Search

Alexandr Andoni; David Cheikhi

arXiv:2105.05761·cs.DS·May 13, 2021

TL;DR

This paper demonstrates how average embeddings can be used to develop efficient approximate nearest neighbor search algorithms, extending classic embedding methods with a data-dependent hashing approach.

Contribution

It introduces a novel approach leveraging average embeddings to improve approximate nearest neighbor search, strengthening traditional bi-Lipschitz embedding techniques.

Findings

01

Embedding a metric space into $ℓ_{2}$ on average enables efficient approximate nearest neighbor search.

02

The existence of an efficient average embedding with distortion $D$ implies an $O(D^{3})$-approximate nearest neighbor search algorithm under $X$.

03

The approach is another application of the data-dependent hashing paradigm for similarity search.

Abstract

In this note, we show that one can use average embeddings, introduced recently in [Naor'20, arXiv:1905.01280], to obtain efficient algorithms for approximate nearest neighbor search. In particular, a metric $X$ embeds into $ℓ_{2}$ on average, with distortion $D$, if, for any distribution $μ$ on $X$, the embedding is $D$-Lipschitz and the (square of the) distance does not decrease on average (with respect to $μ$). The existence of such an embedding (assuming it is efficient) implies an $O(D^{3})$-approximate nearest neighbor search under $X$. This can be seen as a strengthening of the classic (bi-Lipschitz) embedding approach to nearest neighbor search, and is another application of the data-dependent hashing paradigm.

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Machine Learning and Algorithms · Optimization and Search Problems