Approximating Nearest Neighbor Distances
Michael B. Cohen, Brittany Terese Fasy, Gary L. Miller, Amir Nayyeri, Donald R. Sheehy, Ameya Velingker

TL;DR
This paper introduces efficient approximation algorithms for the nearest neighbor metric, a non-Euclidean measure that emphasizes proximity to input points, aiding clustering of noisy data in Euclidean space.
Contribution
It presents a $(3+\varepsilon)$-approximation algorithm and a $(1+\varepsilon)$-approximation algorithm for computing the nearest neighbor metric, both running in near-linear time.
Findings
Both algorithms run in near-linear time.
The $(3+\varepsilon)$-approximation uses shortest paths on a sparse graph.
The $(1+\varepsilon)$-approximation employs a sparse sampling of the ambient space.
Abstract
Several researchers proposed using non-Euclidean metrics on point sets in Euclidean space for clustering noisy data. Almost always, a distance function is desired that recognizes the closeness of the points in the same cluster, even if the Euclidean cluster diameter is large. Therefore, it is preferred to assign smaller costs to the paths that stay close to the input points. In this paper, we consider the most natural metric with this property, which we call the nearest neighbor metric. Given a point set $P$ and a path $\gamma$, our metric charges each point of $\gamma$ with its distance to $P$. The total charge along $\gamma$ determines its nearest neighbor length, which is formally defined as the integral of the distance to the input points along the curve. We describe a $(3+\varepsilon)$-approximation algorithm and a $(1+\varepsilon)$-approximation algorithm to compute the nearest…
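To make the definition of nearest neighbor length concrete, here is a minimal Python sketch that numerically integrates the distance to the point set along a polyline path. It is only an illustration of the definition, not either of the paper's approximation algorithms; the function names `dist_to_set` and `nn_length`, the midpoint-rule integration, and the sample points are all assumptions made for this example.

```python
import math

def dist_to_set(x, points):
    # Euclidean distance from x to its nearest point in the input set P.
    return min(math.dist(x, p) for p in points)

def nn_length(path, points, samples_per_segment=1000):
    # Approximate the integral of dist_to_set along a polyline path
    # using the midpoint rule on each straight segment.
    total = 0.0
    for a, b in zip(path, path[1:]):
        step = math.dist(a, b) / samples_per_segment
        for i in range(samples_per_segment):
            t = (i + 0.5) / samples_per_segment
            x = tuple(ai + t * (bi - ai) for ai, bi in zip(a, b))
            total += dist_to_set(x, points) * step
    return total

# Hypothetical two-point input set.
P = [(0.0, 0.0), (2.0, 0.0)]

# Straight path between the two input points.
direct = nn_length([P[0], P[1]], P)

# A detour through (1, 1) strays farther from P and is charged more.
detour = nn_length([P[0], (1.0, 1.0), P[1]], P)
```

For this input the straight segment has nearest neighbor length close to 1 (the integral of min(t, 2 - t) over [0, 2]), while the detour costs about 2, which reflects the abstract's point that paths staying close to the input points are cheaper.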
