Discovering Data Structures: Nearest Neighbor Search and Beyond
Omar Salemohamed, Laurent Charlin, Shivam Garg, Vatsal Sharan, Gregory Valiant

TL;DR
This paper introduces a flexible, end-to-end learning framework for data structures that adapts to the underlying data distribution. The framework learns data structures and query algorithms from scratch, without careful initialization or seeding with candidate algorithms, and recovers optimal algorithms for 1D nearest neighbor search.
Contribution
The authors present a framework for learning data structures from scratch. In 1D it discovers binary search and variants of interpolation search; in higher dimensions it learns solutions that resemble k-d trees in some regimes and contain elements of locality-sensitive hashing (LSH) in others.
Findings
Discovered optimal algorithms for 1D nearest neighbor search.
Learned data structures resemble classical methods like k-d trees and LSH.
Framework applicable to frequency estimation and potentially other problems.
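For reference, the classical binary-search approach to 1D nearest neighbor search, which the findings say the model rediscovers, can be sketched as follows. This is a minimal illustration of the textbook algorithm, not the learned model from the paper; the function name `nearest_neighbor_1d` and the tie-breaking rule (prefer the smaller value) are assumptions made for this example.

```python
from bisect import bisect_left


def nearest_neighbor_1d(sorted_values, query):
    """Return the element of sorted_values closest to query.

    Assumes sorted_values is sorted in ascending order.
    Ties are broken in favor of the smaller value (an arbitrary choice).
    """
    if not sorted_values:
        raise ValueError("sorted_values must be non-empty")

    # Binary search: index of the first element >= query.
    i = bisect_left(sorted_values, query)

    # Query lies at or beyond one end of the array.
    if i == 0:
        return sorted_values[0]
    if i == len(sorted_values):
        return sorted_values[-1]

    # Otherwise the answer is one of the two elements bracketing the query.
    left, right = sorted_values[i - 1], sorted_values[i]
    return left if query - left <= right - query else right
```

The lookup touches O(log n) elements regardless of how the values are distributed, which is why binary search is the natural distribution-independent baseline for this problem.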
Abstract
We propose a general framework for end-to-end learning of data structures. Our framework adapts to the underlying data distribution and provides fine-grained control over query and space complexity. Crucially, the data structure is learned from scratch, and does not require careful initialization or seeding with candidate data structures/algorithms. We first apply this framework to the problem of nearest neighbor search. In several settings, we are able to reverse-engineer the learned data structures and query algorithms. For 1D nearest neighbor search, the model discovers optimal distribution (in)dependent algorithms such as binary search and variants of interpolation search. In higher dimensions, the model learns solutions that resemble k-d trees in some regimes, while in others, they have elements of locality-sensitive hashing. The model can also learn useful representations of…
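The abstract also mentions variants of interpolation search as the distribution-dependent counterpart to binary search. A minimal sketch of the classical idea follows, assuming numeric values that are roughly uniformly spread; it illustrates the textbook technique, not the specific variant the model learns. The names `interpolation_lower_bound` and `nearest_neighbor_interp` are hypothetical.

```python
def interpolation_lower_bound(sorted_values, query):
    """Index of the first element >= query, found with interpolation probes."""
    n = len(sorted_values)
    if n == 0 or query <= sorted_values[0]:
        return 0
    if query > sorted_values[-1]:
        return n

    lo, hi = 0, n - 1
    # Invariant: sorted_values[lo] < query <= sorted_values[hi].
    while hi - lo > 1:
        span = sorted_values[hi] - sorted_values[lo]  # > 0 by the invariant
        # Guess the position as if values were evenly spaced between lo and hi.
        frac = (query - sorted_values[lo]) / span
        mid = lo + int(frac * (hi - lo))
        # Clamp strictly inside (lo, hi) so every probe shrinks the interval.
        mid = min(max(mid, lo + 1), hi - 1)
        if sorted_values[mid] < query:
            lo = mid
        else:
            hi = mid
    return hi


def nearest_neighbor_interp(sorted_values, query):
    """Nearest element to query; ties go to the smaller value (arbitrary choice)."""
    if not sorted_values:
        raise ValueError("sorted_values must be non-empty")
    i = interpolation_lower_bound(sorted_values, query)
    if i == 0:
        return sorted_values[0]
    if i == len(sorted_values):
        return sorted_values[-1]
    left, right = sorted_values[i - 1], sorted_values[i]
    return left if query - left <= right - query else right
```

On uniformly distributed data, interpolation search is known to need about O(log log n) probes on average, compared with O(log n) for binary search, but it can degrade to linear time on skewed data. That trade-off is what makes the choice of algorithm depend on the data distribution.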
Taxonomy
Topics: Data Mining Algorithms and Applications · Data Management and Algorithms · Advanced Database Systems and Queries
