TL;DR
This paper surveys permutation-based approximate nearest neighbor search methods, evaluates their effectiveness against state-of-the-art techniques, and identifies scenarios where they are most beneficial, especially in high-accuracy, in-memory retrieval tasks.
Contribution
It provides an extensive experimental comparison of permutation methods against state-of-the-art baselines (multi-probe LSH, the VP-tree, and proximity-graph retrieval) and discusses their efficiency and practical utility in large-scale, high-accuracy retrieval.
Findings
Permutation methods are reasonably efficient for in-memory high-accuracy retrieval.
They perform competitively against multi-probe LSH, VP-trees, and proximity graphs.
A specific setup enhances their usefulness in practical applications.
Abstract
We survey permutation-based methods for approximate k-nearest neighbor search. In these methods, every data point is represented by a ranked list of pivots sorted by the distance to this point. Such ranked lists are called permutations. The underpinning assumption is that, for both metric and non-metric spaces, the distance between permutations is a good proxy for the distance between original points. Thus, it should be possible to efficiently retrieve most true nearest neighbors by examining only a tiny subset of data points whose permutations are similar to the permutation of a query. We further test this assumption by carrying out an extensive experimental evaluation where permutation methods are pitted against state-of-the-art benchmarks (the multi-probe LSH, the VP-tree, and proximity-graph based retrieval) on a variety of realistically large data sets from the image and textual…
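The idea described in the abstract can be illustrated with a minimal filter-and-refine sketch. It is not the paper's implementation. It rests on assumptions that are not in the source: Euclidean distance as the original distance, the Spearman footrule as the distance between permutations, randomly sampled pivots, and a brute-force scan over all stored permutations in the filter step. The function names (permutation, footrule, knn_filter_refine) and the parameter n_candidates are hypothetical.

```python
import math
import random


def euclidean(a, b):
    # Assumed original-space distance; any metric or non-metric distance could be swapped in.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def permutation(point, pivots, dist=euclidean):
    """Return ranks[i] = position of pivot i when pivots are sorted by distance to `point`."""
    order = sorted(range(len(pivots)), key=lambda i: dist(point, pivots[i]))
    ranks = [0] * len(pivots)
    for rank, pivot_id in enumerate(order):
        ranks[pivot_id] = rank
    return ranks


def footrule(p, q):
    # Spearman footrule: sum of absolute rank differences (one possible permutation distance).
    return sum(abs(a - b) for a, b in zip(p, q))


def knn_filter_refine(query, data, perms, pivots, k, n_candidates, dist=euclidean):
    """Filter by permutation distance, then refine the candidates with the original distance."""
    q_perm = permutation(query, pivots, dist)
    # Filter: keep the points whose permutations are closest to the query's permutation.
    candidates = sorted(range(len(data)), key=lambda i: footrule(q_perm, perms[i]))
    candidates = candidates[:n_candidates]
    # Refine: rank only the surviving candidates by the true distance.
    candidates.sort(key=lambda i: dist(query, data[i]))
    return candidates[:k]


# Toy data: 500 random 8-dimensional points, 16 pivots sampled from the data.
rng = random.Random(0)
data = [[rng.random() for _ in range(8)] for _ in range(500)]
pivots = rng.sample(data, 16)
perms = [permutation(x, pivots) for x in data]  # computed once, at indexing time
query = [rng.random() for _ in range(8)]

approx = knn_filter_refine(query, data, perms, pivots, k=5, n_candidates=50)
exact = sorted(range(len(data)), key=lambda i: euclidean(query, data[i]))[:5]
recall = len(set(approx) & set(exact)) / 5
```

Here n_candidates controls the trade-off: the refine step computes the original distance for only that many points, and recall depends on how well the permutation distance tracks the original one. Setting n_candidates to the data set size reduces the procedure to exact brute-force search.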
