Kernel Thinning

Raaz Dwivedi; Lester Mackey

arXiv:2105.05842·stat.ML·May 14, 2024

TL;DR

Kernel thinning is a distribution compression method that reduces an $n$-point sample to $\sqrt{n}$ points while keeping comparable worst-case integration error, improving on i.i.d. sampling and standard thinning for a wide range of kernels and distributions.

Contribution

We introduce kernel thinning, a new procedure that compresses an $n$-point approximation to a distribution in $O(n^{2})$ time, with integration-error guarantees that improve on i.i.d. sampling and standard thinning.

Findings

01

Achieves $O_{d}(n^{-1/2} \sqrt{\log n})$ integration error in probability for compactly supported distributions.

02

Outperforms an equal-sized i.i.d. sample, which suffers $\Omega(n^{-1/4})$ integration error.

03

Provides near-optimal $L^{\infty}$ coresets in quadratic time.
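To make the gap between the first two findings concrete, the sketch below tabulates the two error rates with all constants and logarithmic factors dropped. It assumes a coreset of $\lfloor\sqrt{n}\rfloor$ points; the function names are illustrative and do not come from the paper or its repository.

```python
import math


def coreset_size(n):
    """Assumed output size: floor(sqrt(n)) points from n input points."""
    return math.isqrt(n)


def kernel_thinning_rate(n):
    """Order of the kernel thinning error, n^(-1/2), without constants or log factors."""
    return n ** -0.5


def iid_rate(n):
    """Order of the error of an i.i.d. sample with coreset_size(n) points.

    An m-point i.i.d. sample has error of order m^(-1/2); with m = sqrt(n)
    this is the n^(-1/4) rate quoted above.
    """
    m = coreset_size(n)
    return m ** -0.5


if __name__ == "__main__":
    print("n, coreset size, kernel thinning rate, i.i.d. rate")
    for n in (10**2, 10**4, 10**6):
        print(n, coreset_size(n), kernel_thinning_rate(n), iid_rate(n))
```

Under these simplifications an i.i.d. sample would need roughly $n$ points, not $\sqrt{n}$, to reach the $n^{-1/2}$ order, which is the sense in which the compression is better than i.i.d.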

Abstract

We introduce kernel thinning, a new procedure for compressing a distribution $P$ more effectively than i.i.d. sampling or standard thinning. Given a suitable reproducing kernel $k_{\star}$ and $O(n^{2})$ time, kernel thinning compresses an $n$-point approximation to $P$ into a $\sqrt{n}$-point approximation with comparable worst-case integration error across the associated reproducing kernel Hilbert space. The maximum discrepancy in integration error is $O_{d}(n^{-1/2} \sqrt{\log n})$ in probability for compactly supported $P$ and $O_{d}(n^{-1/2} (\log n)^{(d+1)/2} \sqrt{\log \log n})$ for sub-exponential $P$ on $\mathbb{R}^{d}$. In contrast, an equal-sized i.i.d. sample from $P$ suffers $\Omega(n^{-1/4})$ integration error. Our sub-exponential guarantees resemble the classical quasi-Monte Carlo error rates for uniform $P$ on…
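The worst-case integration error over the unit ball of a reproducing kernel Hilbert space is the maximum mean discrepancy (MMD) between the two point sets. The sketch below shows one way to compute it and to score the two baselines named in the abstract, i.i.d. subsampling and standard thinning. It is not an implementation of kernel thinning itself. The Gaussian kernel, its bandwidth, the sample size, and the $\lfloor\sqrt{n}\rfloor$ coreset size are all assumptions chosen for illustration.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel exp(-||x - y||^2 / (2 * bandwidth^2)); an assumed choice."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, kernel):
    """Average of kernel(x, y) over all pairs with x in xs and y in ys."""
    total = sum(kernel(x, y) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd(xs, ys, kernel=gaussian_kernel):
    """MMD between the empirical distributions of xs and ys."""
    squared = (
        mean_kernel(xs, xs, kernel)
        + mean_kernel(ys, ys, kernel)
        - 2.0 * mean_kernel(xs, ys, kernel)
    )
    # Clamp tiny negative values caused by floating-point rounding.
    return math.sqrt(max(squared, 0.0))


rng = random.Random(0)
n = 256
# n-point input approximation: draws from a 2-d standard Gaussian.
points = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]

m = math.isqrt(n)  # assumed coreset size, floor(sqrt(n))
iid_coreset = rng.sample(points, m)   # baseline: uniform subsample
standard_thinned = points[:: n // m]  # baseline: keep every (n/m)-th point

if __name__ == "__main__":
    print("MMD, i.i.d. subsample:   ", mmd(points, iid_coreset))
    print("MMD, standard thinning:  ", mmd(points, standard_thinned))
```

A kernel thinning coreset of the same size could be scored with the same `mmd` function to compare it against these baselines.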

Code & Models

Repositories

microsoft/goodpoints (jax · Official)

Taxonomy

Topics: Mathematical Approximation and Integration · Statistical Methods and Inference