Kernel Density Estimation through Density Constrained Near Neighbor Search
Moses Charikar, Michael Kapralov, Navid Nouri, Paris Siminelakis

TL;DR
This paper introduces a new data structure for efficient kernel density estimation in high-dimensional spaces, improving query time and space complexity by leveraging geometric structure and data-dependent near neighbor search techniques.
Contribution
It presents a novel implementation of importance sampling combined with geometric analysis to enhance kernel density estimation efficiency, especially for Gaussian kernels.
Findings
Improves upon or essentially matches the best known query time and space complexity for all radial kernels considered in the literature so far.
Uses geometric structure of sampled datasets to improve near neighbor search efficiency.
Provides data-dependent bounds that outperform previous worst-case analyses.
Abstract
In this paper we revisit the kernel density estimation problem: given a kernel $K$ and a dataset $P$ of points in high dimensional Euclidean space, prepare a data structure that can quickly output, given a query $q$, a $(1+\epsilon)$-approximation to $\mu := \frac{1}{|P|} \sum_{p \in P} K(p, q)$. First, we give a single data structure based on classical near neighbor search techniques that improves upon or essentially matches the query time and space complexity for all radial kernels considered in the literature so far. We then show how to improve both the query complexity and runtime by using recent advances in data-dependent near neighbor search. We achieve our results by giving a new implementation of the natural importance sampling scheme. Unlike previous approaches, our algorithm first samples the dataset uniformly (considering a geometric sequence of sampling rates), and then…
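To make the problem statement concrete, the following minimal Python sketch computes the exact kernel density for a Gaussian kernel and compares it with a plain uniform-sampling estimate. This is a baseline illustration only, not the paper's data structure: the function names, the unit bandwidth, and the synthetic dataset are all assumptions made for the example.

```python
import math
import random


def gaussian_kernel(p, q):
    """Gaussian kernel exp(-||p - q||^2); unit bandwidth is an assumption."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(p, q)))


def exact_kde(points, q):
    """Exact kernel density: average of K(p, q) over the whole dataset."""
    return sum(gaussian_kernel(p, q) for p in points) / len(points)


def sampled_kde(points, q, m, rng):
    """Naive estimate: average K(p, q) over m points drawn uniformly
    with replacement. Unbiased, but high variance when the density is small."""
    sample = [rng.choice(points) for _ in range(m)]
    return sum(gaussian_kernel(p, q) for p in sample) / m


# Hypothetical synthetic data: 2000 standard Gaussian points in 5 dimensions.
rng = random.Random(0)
points = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(2000)]
q = [0.0] * 5

mu = exact_kde(points, q)               # linear scan over the dataset
est = sampled_kde(points, q, 500, rng)  # touches only 500 points
print(f"exact: {mu:.4f}  sampled: {est:.4f}")
```

The exact computation scans every point for each query. The uniform estimator is cheaper but needs more samples as the true density shrinks, which is the regime where importance sampling schemes such as the one described in the abstract are meant to help.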
