Permutation Search Methods are Efficient, Yet Faster Search is Possible

Bilegsaikhan Naidan; Leonid Boytsov; Eric Nyberg

arXiv:1506.03163 · cs.LG · November 1, 2016

TL;DR

This paper surveys permutation-based approximate nearest neighbor search methods, evaluates their effectiveness against state-of-the-art techniques, and identifies scenarios where they are most beneficial, especially in high-accuracy, in-memory retrieval tasks.

Contribution

It provides an extensive experimental comparison of permutation methods with leading benchmarks and discusses their efficiency and practical utility in large-scale, high-accuracy retrieval.

Findings

01. Permutation methods are reasonably efficient for in-memory high-accuracy retrieval.

02. They perform competitively against multi-probe LSH, VP-trees, and proximity graphs.

03. A specific setup enhances their usefulness in practical applications.

Abstract

We survey permutation-based methods for approximate k-nearest neighbor search. In these methods, every data point is represented by a ranked list of pivots sorted by the distance to this point. Such ranked lists are called permutations. The underpinning assumption is that, for both metric and non-metric spaces, the distance between permutations is a good proxy for the distance between original points. Thus, it should be possible to efficiently retrieve most true nearest neighbors by examining only a tiny subset of data points whose permutations are similar to the permutation of a query. We further test this assumption by carrying out an extensive experimental evaluation where permutation methods are pitted against state-of-the-art benchmarks (the multi-probe LSH, the VP-tree, and proximity-graph based retrieval) on a variety of realistically large data sets from the image and textual…
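To make the idea in the abstract concrete, here is a minimal Python sketch of a filter-and-refine search over permutations. It is an illustration under stated assumptions, not the authors' implementation or the nmslib API: the function names (`permutation`, `footrule`, `knn_search`), the random choice of pivots from the data, the Euclidean distance, the Spearman footrule as the permutation distance, and the brute-force scan over all stored permutations are all choices made here for illustration only.

```python
import math
import random


def euclidean(a, b):
    """Plain Euclidean distance; any distance function could be swapped in."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def permutation(point, pivots, dist=euclidean):
    """Return ranks, where ranks[i] is the position of pivot i once all
    pivots are sorted by increasing distance to `point`."""
    order = sorted(range(len(pivots)), key=lambda i: dist(point, pivots[i]))
    ranks = [0] * len(pivots)
    for pos, pivot_id in enumerate(order):
        ranks[pivot_id] = pos
    return ranks


def footrule(p, q):
    """Spearman footrule: sum of absolute rank differences (an assumed choice)."""
    return sum(abs(a - b) for a, b in zip(p, q))


def knn_search(query, data, pivots, perms, k, n_candidates, dist=euclidean):
    """Filter by permutation distance, then refine with the original distance."""
    q_perm = permutation(query, pivots, dist)
    # Filter step: keep the points whose permutations are closest to the query's.
    candidates = sorted(range(len(data)),
                        key=lambda i: footrule(q_perm, perms[i]))[:n_candidates]
    # Refine step: compute true distances only for the surviving candidates.
    candidates.sort(key=lambda i: dist(query, data[i]))
    return candidates[:k]


# Toy data: 500 random 8-dimensional points, 16 pivots sampled from the data.
rng = random.Random(0)
data = [[rng.random() for _ in range(8)] for _ in range(500)]
pivots = rng.sample(data, 16)
perms = [permutation(x, pivots) for x in data]  # computed once at index time

query = [rng.random() for _ in range(8)]
approx = knn_search(query, data, pivots, perms, k=10, n_candidates=50)
exact = sorted(range(len(data)), key=lambda i: euclidean(query, data[i]))[:10]
recall = len(set(approx) & set(exact)) / 10
print("recall@10:", recall)
```

In this sketch only `n_candidates` true distances are computed per query (plus one distance per pivot), which is where the savings would come from when the original distance is expensive. Raising `n_candidates` trades speed for recall; setting it to the data set size reduces the search to an exact scan.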

Code & Models

Repositories

searchivarius/nmslib (official)
