
TL;DR
This paper studies how to construct small epsilon-samples for kernel density estimates. It shows that the smoother range spaces defined by kernels such as the Gaussian can significantly reduce the sample size needed for a given approximation error.
Contribution
It introduces epsilon-samples for kernel density estimates, where the ranges take continuous values in [0,1] rather than binary ones, and gives improved bounds on sample size and discrepancy for kernels such as the Gaussian.
Findings
Sample size for Gaussian kernels is O((1/eps) sqrt(log(1/eps))) in the plane.
Discrepancy for these kernels is O(sqrt(log n)), asymptotically smaller than the discrepancy of the binary range space defined by balls.
Bounds are derived using VC-dimension and discrepancy analysis.
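To get a feel for how the planar Gaussian bound grows, the sketch below evaluates (1/eps) sqrt(log(1/eps)) for a few values of eps. The hidden constant in the O-notation is set to 1, which is an assumption, so the numbers only indicate growth rates and not actual sample sizes. The 1/eps^2 column is the standard random-sampling rate, included here as a baseline; that comparison is not stated in this summary.

```python
import math


def gaussian_sample_bound(eps: float) -> float:
    """(1/eps) * sqrt(log(1/eps)): the planar Gaussian-kernel bound.

    The constant hidden by the O-notation is assumed to be 1.
    """
    return (1.0 / eps) * math.sqrt(math.log(1.0 / eps))


def random_sample_bound(eps: float) -> float:
    """1/eps^2: baseline random-sampling rate (assumption, constant set to 1)."""
    return 1.0 / eps ** 2


for eps in (0.1, 0.01, 0.001):
    print(
        f"eps={eps}: kernel bound ~ {gaussian_sample_bound(eps):.0f}, "
        f"random-sampling baseline ~ {random_sample_bound(eps):.0f}"
    )
```

The gap between the two columns widens as eps shrinks, which is the sense in which the kernel bound is an improvement.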
Abstract
We study the worst case error of kernel density estimates via subset approximation. A kernel density estimate of a distribution is the convolution of that distribution with a fixed kernel (e.g. Gaussian kernel). Given a subset (i.e. a point set) of the input distribution, we can compare the kernel density estimates of the input distribution with that of the subset and bound the worst case error. If the maximum error is eps, then this subset can be thought of as an eps-sample (aka an eps-approximation) of the range space defined with the input distribution as the ground set and the fixed kernel representing the family of ranges. Interestingly, in this case the ranges are not binary, but have a continuous range (for simplicity we focus on kernels with range of [0,1]); these allow for smoother notions of range spaces. It turns out, the use of this smoother family of range spaces has an…
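The comparison described in the abstract can be sketched as follows: compute the kernel density estimate of a point set P and of a subset S, then take the largest absolute difference over a set of query points. This is only an illustration under several assumptions. The kernel is an unnormalized Gaussian so that its values stay in (0, 1], the subset is drawn uniformly at random (not the paper's construction), and the maximum is taken over a finite grid, so it only estimates the true worst-case error from below. All function names are hypothetical.

```python
import math
import random
from typing import List, Tuple

Point = Tuple[float, float]


def gaussian_kernel(x: Point, p: Point, sigma: float = 1.0) -> float:
    """Unnormalized Gaussian kernel; values lie in (0, 1]."""
    d2 = (x[0] - p[0]) ** 2 + (x[1] - p[1]) ** 2
    return math.exp(-d2 / (2.0 * sigma ** 2))


def kde(points: List[Point], x: Point, sigma: float = 1.0) -> float:
    """Kernel density estimate at x: the average kernel value over the points."""
    return sum(gaussian_kernel(x, p, sigma) for p in points) / len(points)


def max_kde_error(
    P: List[Point], S: List[Point], queries: List[Point], sigma: float = 1.0
) -> float:
    """Largest |KDE_P(q) - KDE_S(q)| over the given query points."""
    return max(abs(kde(P, q, sigma) - kde(S, q, sigma)) for q in queries)


rng = random.Random(0)
# Input point set in the plane (synthetic data, for illustration only).
P = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(1000)]
# Uniform random subset; a stand-in for a carefully constructed eps-sample.
S = rng.sample(P, 100)
# Finite grid of query points covering [-3, 3] x [-3, 3].
grid = [-3.0 + 0.5 * i for i in range(13)]
queries = [(x, y) for x in grid for y in grid]

err = max_kde_error(P, S, queries)
print(f"estimated worst-case error over the grid: {err:.4f}")
```

If the printed error is at most eps at every query point in the plane, and not just on the grid, then S is an eps-sample of P for this kernel.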
Taxonomy
Topics: Mathematical Approximation and Integration · Machine Learning and Algorithms · Image and Object Detection Techniques
