Learning Nearest Neighbor Graphs from Noisy Distance Samples
Blake Mason, Ardhendu Tripathy, and Robert Nowak

TL;DR
This paper introduces an active algorithm for learning nearest neighbor graphs from noisy distance samples. It works for general metrics and is effective on real crowdsourced preference data.
Contribution
It presents a noise-tolerant method for learning nearest neighbor graphs that requires no Euclidean assumptions, only symmetry and the triangle inequality, and supports it with a query-complexity analysis and empirical validation.
Findings
Requires only $O(n \log(n)\,\Delta^{-2})$ queries in favorable settings, where $\Delta^{-2}$ accounts for the effect of noise
Outperforms baseline methods in experiments
Effective on crowdsourced shoe similarity data
Abstract
We consider the problem of learning the nearest neighbor graph of a dataset of n items. The metric is unknown, but we can query an oracle to obtain a noisy estimate of the distance between any pair of items. This framework applies to problem domains where one wants to learn people's preferences from responses commonly modeled as noisy distance judgments. In this paper, we propose an active algorithm to find the graph with high probability and analyze its query complexity. In contrast to existing work that forces Euclidean structure, our method is valid for general metrics, assuming only symmetry and the triangle inequality. Furthermore, we demonstrate efficiency of our method empirically and theoretically, needing only $O(n \log(n)\,\Delta^{-2})$ queries in favorable settings, where $\Delta^{-2}$ accounts for the effect of noise. Using crowd-sourced data collected for a subset of the UT Zappos50K…
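To make the query model concrete, here is a minimal sketch of active nearest neighbor identification from a noisy distance oracle. It is not the paper's algorithm: it uses plain successive elimination and does not exploit the triangle inequality, which the abstract names as part of the method's assumptions. The names `make_oracle`, `nearest_neighbor`, and `nn_graph` are hypothetical, the noise is assumed Gaussian with a known scale `sigma`, and Euclidean points are used only to simulate the unknown metric.

```python
import math
import random


def make_oracle(points, sigma, rng):
    """Hypothetical oracle: true distance plus Gaussian noise (an assumption)."""
    def oracle(i, j):
        return math.dist(points[i], points[j]) + rng.gauss(0.0, sigma)
    return oracle


def nearest_neighbor(i, n, oracle, sigma, delta=0.05, max_rounds=20000):
    """Find the nearest neighbor of item i by successive elimination.

    Each round queries every surviving candidate once, then drops any
    candidate whose lower confidence bound exceeds the smallest upper bound.
    """
    active = [j for j in range(n) if j != i]
    sums = {j: 0.0 for j in active}
    for t in range(1, max_rounds + 1):
        for j in active:
            sums[j] += oracle(i, j)
        # Anytime confidence radius for sigma-sub-Gaussian noise,
        # union-bounded over candidates and rounds.
        radius = sigma * math.sqrt(2.0 * math.log(4.0 * n * t * t / delta) / t)
        means = {j: sums[j] / t for j in active}
        best_upper = min(means.values()) + radius
        active = [j for j in active if means[j] - radius <= best_upper]
        if len(active) == 1:
            return active[0]
    # Query budget exhausted: fall back to the smallest empirical mean.
    return min(active, key=lambda j: sums[j])


def nn_graph(n, oracle, sigma, delta=0.05):
    """Map each item to its estimated nearest neighbor."""
    return {i: nearest_neighbor(i, n, oracle, sigma, delta) for i in range(n)}


if __name__ == "__main__":
    rng = random.Random(0)
    points = [(0.0,), (1.0,), (3.0,), (7.0,)]
    oracle = make_oracle(points, sigma=0.1, rng=rng)
    print(nn_graph(len(points), oracle, sigma=0.1))
```

In this baseline, a candidate whose distance exceeds the nearest neighbor's by a gap Δ survives for roughly Δ^-2 rounds up to logarithmic factors, which illustrates why a Δ^-2 term appears in query bounds of this kind. Each item is handled independently here, so the sketch does not share information across items.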
Taxonomy
Topics: Machine Learning and Algorithms · Algorithms and Data Compression · Data Management and Algorithms
