Improved Coresets for Kernel Density Estimates
Jeff M. Phillips, Wai Ming Tai

TL;DR
This paper shows how to construct small coresets that approximate kernel density estimates. For characteristic kernels, the coreset size depends only on the error parameter, not on the dimension, the diameter of the point set, or the kernel bandwidth, and the bound is tight when the dimension is unrestricted.
Contribution
It introduces new bounds and analysis for coresets approximating kernel density estimates, especially for characteristic kernels, and refines bounds for Gaussian kernels in fixed dimensions.
Findings
Coreset size is $2/\epsilon^2$ for characteristic kernels, independent of data dimension.
Tighter bounds are achieved for Gaussian kernels when the dimension is constant.
The analysis unifies approaches from statistics, machine learning, and geometry.
Abstract
We study the construction of coresets for kernel density estimates. That is, we show how to approximate the kernel density estimate described by a large point set with another kernel density estimate with a much smaller point set. For characteristic kernels (including Gaussian and Laplace kernels), our approximation preserves the error between kernel density estimates within error $\epsilon$, with coreset size $2/\epsilon^2$, but no other aspects of the data, including the dimension, the diameter of the point set, or the bandwidth of the kernel common to other approximations. When the dimension is unrestricted, we show this bound is tight for these kernels as well as a much broader set. This work provides a careful analysis of the iterative Frank-Wolfe algorithm adapted to this context, an algorithm called \emph{kernel herding}. This analysis unites a broad line of work that…
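To make the kernel herding idea in the abstract concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a Gaussian kernel, uniform weights on the selected points, candidates restricted to the input set (repeats allowed), and a Frank-Wolfe step size of 1/(t+1). The function names `gaussian_kernel`, `kernel_herding`, and `kde` are hypothetical and chosen for illustration only.

```python
import numpy as np

def gaussian_kernel(A, B, bandwidth=1.0):
    """Pairwise Gaussian kernel values K(a, b) = exp(-||a - b||^2 / (2 h^2))."""
    sq = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))

def kernel_herding(P, m, bandwidth=1.0):
    """Greedily pick m indices into P (assumed variant: uniform weights).

    Each step picks the point where the full KDE most exceeds the KDE of
    the points chosen so far, which is the Frank-Wolfe linear step for
    the squared distance between the two kernel mean embeddings.
    """
    K = gaussian_kernel(P, P, bandwidth)   # n x n kernel matrix
    mu = K.mean(axis=1)                    # KDE of P evaluated at each point of P
    running_sum = np.zeros(len(P))         # sum of K(q, .) over chosen points q
    chosen = []
    for t in range(m):
        current = running_sum / t if t > 0 else running_sum
        idx = int(np.argmax(mu - current))
        chosen.append(idx)
        running_sum += K[:, idx]
    return np.array(chosen)

def kde(Q, X, bandwidth=1.0):
    """Evaluate the uniform-weight KDE of point set Q at query points X."""
    return gaussian_kernel(X, Q, bandwidth).mean(axis=1)

# Usage: compare the full KDE with the coreset KDE at the input points.
rng = np.random.default_rng(0)
P = rng.normal(size=(300, 2))
idx = kernel_herding(P, m=50)
err = np.max(np.abs(kde(P, P) - kde(P[idx], P)))
print(f"coreset size {len(idx)}, max KDE difference on P: {err:.4f}")
```

The sketch builds the full n x n kernel matrix, so it is only suitable for small inputs. The printed error is measured at the input points only and is an empirical check, not a verification of the bound stated in the abstract.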
Taxonomy
Methods: Coresets
