Down the Rabbit Hole: Robust Proximity Search and Density Estimation in Sublinear Space
Sariel Har-Peled, Nirman Kumar

TL;DR
This paper introduces a sublinear space data structure for approximate nearest neighbor queries and density estimation in high-dimensional data, enabling efficient proximity searches and density calculations with minimal storage.
Contribution
The work presents a novel sublinear space data structure for $(1+\varepsilon,k)$-ANN queries and density estimation, extending geometric data summarization techniques.
Findings
Achieves logarithmic query time for proximity searches.
Uses roughly O(n/k) space, which is sublinear in the input size when k is sufficiently large.
Provides sampling methods with linear dependency on dimension for density estimation.
Abstract
For a set of $n$ points in $\mathbb{R}^d$, and parameters $k$ and $\varepsilon$, we present a data structure that answers $(1+\varepsilon,k)$-ANN queries in logarithmic time. Surprisingly, the space used by the data-structure is $\widetilde{O}(n/k)$; that is, the space used is sublinear in the input size if $k$ is sufficiently large. Our approach provides a novel way to summarize geometric data, such that meaningful proximity queries on the data can be carried out using this sketch. Using this, we provide a sublinear space data-structure that can estimate the density of a point set under various measures, including: (i) sum of distances of $k$ closest points to the query point, and (ii) sum of squared distances of $k$ closest points to the query point. Our approach generalizes to other distance based estimation of densities of similar flavor. We also study…
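To make the quantities named in the abstract concrete, here is a minimal brute-force sketch in Python. It is not the paper's data structure: it stores every point and scans them all per query, so it uses linear space and linear query time. It only pins down what a sublinear sketch would be approximating. All function names are hypothetical, and the two-sided acceptance interval in `is_acceptable_estimate` is an assumed reading of a $(1+\varepsilon)$-approximation, not a definition taken from the paper.

```python
import math


def knn_distances(points, query, k):
    """Return the k smallest Euclidean distances from query to points, ascending."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    dists = sorted(math.dist(p, query) for p in points)
    return dists[:k]


def kth_nn_distance(points, query, k):
    """Exact distance from query to its k-th nearest neighbor."""
    return knn_distances(points, query, k)[-1]


def sum_of_distances(points, query, k):
    """Density measure (i): sum of distances to the k closest points."""
    return sum(knn_distances(points, query, k))


def sum_of_squared_distances(points, query, k):
    """Density measure (ii): sum of squared distances to the k closest points."""
    return sum(d * d for d in knn_distances(points, query, k))


def is_acceptable_estimate(estimate, exact, eps):
    """Assumed criterion: estimate lies within a (1 + eps) factor of exact."""
    return exact / (1 + eps) <= estimate <= (1 + eps) * exact


points = [(0, 0), (3, 4), (6, 8), (0, 1)]
query = (0, 0)
exact = kth_nn_distance(points, query, k=3)
```

A baseline like this is mainly useful as ground truth when testing an approximate structure: compute the exact value on a small input, then check the approximate answer with `is_acceptable_estimate`.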
Taxonomy
Topics: Computational Geometry and Mesh Generation · Data Management and Algorithms · Advanced Image and Video Retrieval Techniques
