A reliable order-statistics-based approximate nearest neighbor search algorithm
Luisa Verdoliva, Davide Cozzolino, Giovanni Poggi

TL;DR
This paper introduces an approximate nearest neighbor search algorithm that uses order statistics and cone-based space partitioning, and reports state-of-the-art performance on both simulated and real-world data.
Contribution
It presents an order-statistics-based algorithm that classifies vectors into cones by the index and sign of their largest components, which lets it handle unstructured data well and search only the query's cone and its neighbors.
Findings
Reports state-of-the-art performance on both simulated and real-world data
Handles the challenging case of unstructured data effectively
Serves as a building block for more complex techniques that deal with structured data
Abstract
We propose a new algorithm for fast approximate nearest neighbor search based on the properties of ordered vectors. Data vectors are classified based on the index and sign of their largest components, thereby partitioning the space in a number of cones centered in the origin. The query is itself classified, and the search starts from the selected cone and proceeds to neighboring ones. Overall, the proposed algorithm corresponds to locality sensitive hashing in the space of directions, with hashing based on the order of components. Thanks to the statistical features emerging through ordering, it deals very well with the challenging case of unstructured data, and is a valuable building block for more complex techniques dealing with structured data. Experiments on both simulated and real-world data prove the proposed algorithm to provide a state-of-the-art performance.
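The cone classification described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names (`cone_key`, `build_index`, `search`), the parameter `k` (how many largest components form the hash key), the parameter `max_cones`, and the rule used to rank "neighboring" cones (number of shared index/sign pairs with the query's key) are all assumptions made for this example, since the abstract does not specify them.

```python
from collections import defaultdict
import math


def cone_key(vec, k=2):
    """Hash a vector by the index and sign of its k largest-magnitude components.

    The key is ordered by decreasing magnitude, so it depends only on the
    direction of the vector, not on its length.
    """
    top = sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)[:k]
    return tuple((i, vec[i] >= 0) for i in top)


def build_index(data, k=2):
    """Group the row numbers of the data vectors by cone key."""
    buckets = defaultdict(list)
    for row, vec in enumerate(data):
        buckets[cone_key(vec, k)].append(row)
    return buckets


def search(query, data, buckets, k=2, max_cones=3):
    """Approximate nearest neighbor: scan the query's cone, then nearby cones.

    ASSUMPTION: a cone counts as "closer" to the query's cone when its key
    shares more (index, sign) pairs with the query's key. The paper may
    define the visiting order differently.
    """
    qkey = set(cone_key(query, k))
    ranked = sorted(buckets, key=lambda key: len(qkey & set(key)), reverse=True)
    best_row, best_dist = None, math.inf
    for key in ranked[:max_cones]:
        for row in buckets[key]:
            d = math.dist(query, data[row])  # Euclidean distance (assumed metric)
            if d < best_dist:
                best_row, best_dist = row, d
    return best_row, best_dist
```

Raising `max_cones` trades speed for accuracy: with `max_cones=1` only the query's own cone is scanned, while a value equal to the number of occupied cones degenerates to an exhaustive search.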
