Unsupervised Space Partitioning for Nearest Neighbor Search
Abrar Fahim, Mohammed Eunus Ali, Muhammad Aamir Cheema

TL;DR
This paper introduces an unsupervised, end-to-end learning framework for space partitioning in Approximate Nearest Neighbor Search that outperforms existing methods without requiring dataset pre-processing.
Contribution
It presents a novel unsupervised multi-objective loss function and ensembling technique for space partitioning, improving ANNS performance and efficiency over traditional clustering methods.
Findings
Outperforms state-of-the-art partitioning methods on benchmarks.
Requires fewer parameters and shorter training times.
Enhances existing ANNS techniques like ScaNN significantly.
Abstract
Approximate Nearest Neighbor Search (ANNS) in high-dimensional spaces is crucial for many real-life applications (e.g., e-commerce, web, and multimedia) dealing with an abundance of data. This paper proposes an end-to-end learning framework that couples the partitioning (one critical step of ANNS) and learning-to-search steps using a custom loss function. A key advantage of our proposed solution is that it does not require any expensive pre-processing of the dataset, which is one of the critical limitations of the state-of-the-art approach. We achieve this advantage by formulating a multi-objective custom loss function that does not need ground truth labels to quantify the quality of a given data-space partition, making it entirely unsupervised. We also propose an ensembling technique by adding varying input weights to the loss function to train an ensemble of models to enhance the…
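To make the idea of a label-free, multi-objective partition loss concrete, here is a minimal sketch in Python with NumPy. It is an illustration under stated assumptions, not the paper's actual formulation: the two terms (a neighbor-agreement term and a bin-balance term), the function names knn_indices and partition_loss, and the weight eta are all hypothetical choices made for this example. The only property it shares with the abstract for certain is that it scores a partition without ground truth labels.

```python
import numpy as np


def knn_indices(points, k):
    """Brute-force indices of the k nearest neighbors of each point (self excluded)."""
    sq = np.sum(points ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * points @ points.T
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbor
    return np.argsort(d2, axis=1)[:, :k]


def partition_loss(assign, neighbors, eta=1.0):
    """Score a soft partition without labels (hypothetical two-term loss).

    assign:    (n, m) array, row i is point i's probability over m bins.
    neighbors: (n, k) array of neighbor indices, e.g. from knn_indices.
    eta:       assumed trade-off weight between the two objectives.
    """
    # Quality term: probability that a point and one of its neighbors
    # fall into different bins, averaged over all (point, neighbor) pairs.
    same_bin = np.einsum("nm,nkm->nk", assign, assign[neighbors])
    quality = 1.0 - same_bin.mean()

    # Balance term: squared deviation of average bin occupancy from uniform,
    # which penalizes putting every point into a single bin.
    m = assign.shape[1]
    occupancy = assign.mean(axis=0)
    balance = np.sum((occupancy - 1.0 / m) ** 2)

    return quality + eta * balance


# Two well-separated clusters of four points each.
points = np.array(
    [[0, 0], [0, 1], [1, 0], [1, 1], [10, 10], [10, 11], [11, 10], [11, 11]],
    dtype=float,
)
neighbors = knn_indices(points, k=2)

good = np.array([[1, 0]] * 4 + [[0, 1]] * 4, dtype=float)   # one bin per cluster
collapsed = np.array([[1, 0]] * 8, dtype=float)              # everything in bin 0
mixed = np.array([[1, 0], [0, 1]] * 4, dtype=float)          # bins cut through clusters

loss_good = partition_loss(good, neighbors)
loss_collapsed = partition_loss(collapsed, neighbors)
loss_mixed = partition_loss(mixed, neighbors)
```

In this toy setup the partition that follows the clusters scores lowest, the collapsed partition is penalized only by the balance term, and the partition that cuts through clusters is penalized by the quality term. In a learned setting, assign would be the softmax output of a model and the loss would be minimized by gradient descent; that training loop is omitted here.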
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Video Surveillance and Tracking Methods · Data Management and Algorithms
Methods: k-Means Clustering
