Proximity in the Age of Distraction: Robust Approximate Nearest Neighbor Search
Sariel Har-Peled, Sepideh Mahabadi

TL;DR
This paper introduces a robust approximate nearest neighbor (ANN) search method that tolerates arbitrary corruption or unknown values in some dataset coordinates, giving a simple, practical data structure with polynomial bounds for high-dimensional data.
Contribution
It presents a general reduction that adapts existing ANN algorithms to be robust against coordinate corruption, using a sampling technique that achieves a bi-criterion approximation.
Findings
Achieves bi-criterion approximation for robust ANN.
Provides a simple, practical data structure with polynomial bounds.
Extends to various applications and improvements.
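The sampling idea behind these findings can be illustrated with a rough Python sketch. It assumes the l1 distance, keeps each coordinate independently with a fixed probability, and answers each projected query by exhaustive search as a stand-in for a real ANN structure. The names `trimmed_l1`, `build_projections`, and `query`, as well as the keep probability and the number of repetitions, are hypothetical choices for illustration and are not the parameters or the analysis of the paper.

```python
import random
from typing import List, Optional, Sequence, Tuple


def trimmed_l1(q: Sequence[float], x: Sequence[float], k: int) -> float:
    """l1 distance between q and x after dropping the k largest coordinate gaps."""
    diffs = sorted(abs(a - b) for a, b in zip(q, x))
    return sum(diffs[:max(len(diffs) - k, 0)])


def build_projections(dim: int, keep_prob: float, repetitions: int,
                      seed: int = 0) -> List[List[int]]:
    """Draw `repetitions` random coordinate subsets (illustrative parameters)."""
    rng = random.Random(seed)
    return [[i for i in range(dim) if rng.random() < keep_prob]
            for _ in range(repetitions)]


def query(points: Sequence[Sequence[float]], projections: List[List[int]],
          q: Sequence[float], k: int) -> Tuple[Optional[int], float]:
    """Return the best candidate found over all projections."""
    best_idx, best_dist = None, float("inf")
    for coords in projections:
        # Stand-in for an ANN query on the points projected onto `coords`.
        cand = min(range(len(points)),
                   key=lambda i: sum(abs(points[i][c] - q[c]) for c in coords))
        # Score the candidate in the original space, ignoring k coordinates.
        d = trimmed_l1(q, points[cand], k)
        if d < best_dist:
            best_idx, best_dist = cand, d
    return best_idx, best_dist
```

The intuition is that a random subset has a reasonable chance of missing the few corrupted coordinates, so a point that is close once those coordinates are ignored also looks close in the projected space. Repeating the sampling raises the chance that at least one projection behaves this way.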
Abstract
We introduce a new variant of the nearest neighbor search problem, which allows some coordinates of the dataset to be arbitrarily corrupted or unknown. Formally, given a dataset of points in high dimensions and a parameter k, the goal is to preprocess the dataset so that, given a query point q, one can quickly compute a dataset point whose distance to q is minimized when the "optimal" k coordinates are ignored. Note that the coordinates being ignored are a function of both the query point and the point returned. We present a general reduction from this problem to answering ANN queries, which is similar in spirit to LSH (locality sensitive hashing) [IM98]. Specifically, we give a sampling technique which achieves a bi-criterion approximation for this problem. If the distance to the nearest neighbor after…
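To make the problem statement concrete, here is a minimal brute-force sketch in Python. It assumes an l_p norm, under which the "optimal" coordinates to ignore for a given pair of points are simply the k coordinates with the largest absolute differences. The function names `robust_distance` and `robust_nearest_neighbor` are hypothetical, and the linear scan is only a reference for the definition, not the data structure described in the paper.

```python
from typing import Sequence, Tuple


def robust_distance(q: Sequence[float], x: Sequence[float],
                    k: int, p: float = 1) -> float:
    """l_p distance between q and x after ignoring the k largest coordinate gaps."""
    # Sort per-coordinate contributions so the k largest sit at the end.
    diffs = sorted(abs(a - b) ** p for a, b in zip(q, x))
    kept = diffs[:max(len(diffs) - k, 0)]
    return sum(kept) ** (1.0 / p)


def robust_nearest_neighbor(points: Sequence[Sequence[float]],
                            q: Sequence[float], k: int,
                            p: float = 1) -> Tuple[int, float]:
    """Exhaustively find the point minimizing the robust distance to q."""
    best = min(range(len(points)),
               key=lambda i: robust_distance(q, points[i], k, p))
    return best, robust_distance(q, points[best], k, p)


points = [[0, 0, 0, 0], [1, 1, 1, 100], [5, 5, 5, 5]]
q = [1, 1, 1, 1]
print(robust_nearest_neighbor(points, q, k=0))  # (0, 4.0)
print(robust_nearest_neighbor(points, q, k=1))  # (1, 0.0)
```

With k = 0 the second point is far from the query because of its single corrupted coordinate. With k = 1 that coordinate is ignored and the point becomes an exact match, which shows why the ignored set depends on both the query and the returned point.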
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Optimization and Search Problems · Algorithms and Data Compression
