Learning Cluster Representatives for Approximate Nearest Neighbor Search
Thomas Vecchiato

TL;DR
This paper introduces a novel learning-to-rank approach for clustering-based approximate nearest neighbor search, significantly improving accuracy by learning cluster representatives through a simple linear function.
Contribution
It presents a new method that leverages learning-to-rank for optimizing cluster representatives, enhancing the efficiency and accuracy of approximate nearest neighbor search.
Findings
Learning cluster representatives with a linear function improves search accuracy.
Routing each query to only its most relevant clusters reduces the search space in high-dimensional data.
The method is reported to achieve state-of-the-art performance in maximum inner product search.
Abstract
Developing increasingly efficient and accurate algorithms for approximate nearest neighbor search is a paramount goal in modern information retrieval. A primary approach to addressing this question is clustering, which involves partitioning the dataset into distinct groups, with each group characterized by a representative data point. By this method, retrieving the top-k data points for a query requires identifying the most relevant clusters based on their representatives -- a routing step -- and then conducting a nearest neighbor search within these clusters only, drastically reducing the search space. The objective of this thesis is not only to provide a comprehensive explanation of clustering-based approximate nearest neighbor search but also to introduce and delve into every aspect of our novel state-of-the-art method, which originated from a natural observation: The routing…
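The two-stage pipeline described in the abstract (partition the data, route the query to the best clusters through their representatives, then search only inside those clusters) can be sketched as follows. This is a minimal illustration of the generic clustering-based baseline, not the thesis's learned method: it assumes plain k-means with mean centroids as representatives and inner product as the similarity, and the names `build_index`, `search`, `num_clusters`, and `nprobe` are hypothetical choices made for this example.

```python
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def mean(points):
    """Coordinate-wise mean of a non-empty list of vectors."""
    dim = len(points[0])
    return [sum(p[i] for p in points) / len(points) for i in range(dim)]


def build_index(data, num_clusters, iters=10, seed=0):
    """Partition `data` with a basic k-means (assumed here for illustration).

    Returns (reps, clusters): reps[c] is the mean of cluster c, and
    clusters[c] is the list of indices into `data` assigned to cluster c.
    """
    rng = random.Random(seed)
    reps = [list(p) for p in rng.sample(data, num_clusters)]
    clusters = [[] for _ in range(num_clusters)]
    for _ in range(iters):
        clusters = [[] for _ in range(num_clusters)]
        for idx, p in enumerate(data):
            # Assign each point to the closest representative (squared Euclidean).
            best = min(
                range(num_clusters),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, reps[c])),
            )
            clusters[best].append(idx)
        # Recompute each representative as the mean of its members;
        # an empty cluster keeps its previous representative.
        reps = [
            mean([data[i] for i in members]) if members else reps[c]
            for c, members in enumerate(clusters)
        ]
    return reps, clusters


def search(query, data, reps, clusters, k, nprobe):
    """Return indices of the top-k points by inner product with `query`,
    looking only inside the `nprobe` highest-scoring clusters."""
    # Routing step: rank clusters by the score of their representative.
    ranked = sorted(range(len(reps)), key=lambda c: dot(query, reps[c]), reverse=True)
    # Restricted search: score only the points in the selected clusters.
    candidates = [i for c in ranked[:nprobe] for i in clusters[c]]
    return sorted(candidates, key=lambda i: dot(query, data[i]), reverse=True)[:k]


if __name__ == "__main__":
    rng = random.Random(42)
    data = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(500)]
    reps, clusters = build_index(data, num_clusters=10)
    query = [rng.gauss(0, 1) for _ in range(16)]
    print(search(query, data, reps, clusters, k=5, nprobe=2))
```

With `nprobe` equal to the number of clusters the search is exhaustive and therefore exact; smaller values trade accuracy for fewer scored points. In this sketch the routing quality depends entirely on how well `reps` rank the clusters for a given query, which is the component the summary above says the method learns with a linear function.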
Taxonomy
Topics: Data Management and Algorithms · Face and Expression Recognition
