Compressed Empirical Measures (in finite dimensions)

Steffen Grünewälder

arXiv:2204.08847 · stat.ML · August 29, 2024

Open Access

TL;DR

This paper investigates methods for compressing empirical measures in finite-dimensional RKHSs, deriving bounds on coreset sizes, and analyzing the impact of data and kernel properties on compression quality.

Contribution

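It derives high-probability lower bounds on the size of the largest ball around the empirical measure that is contained within the empirical convex set, the quantity that controls how large a coreset has to be. The bounds are based on data density, kernel properties, and covariance eigenvalues, and the results are extended to kernel ridge regression and algorithmic guarantees.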

Findings

01. Lower bounds on the size of the ball that controls coreset size follow from conditions on the data density and the kernel function.

02. A lower bound on the smallest eigenvalue of a covariance operator informs compression limits.

03. Standard algorithms such as conditional gradient have quantifiable compression guarantees.

Abstract

We study approaches for compressing the empirical measure in the context of finite dimensional reproducing kernel Hilbert spaces (RKHSs). In this context, the empirical measure is contained within a natural convex set and can be approximated using convex optimization methods. Such an approximation gives rise to a coreset of data points. A key quantity that controls how large such a coreset has to be is the size of the largest ball around the empirical measure that is contained within the empirical convex set. The bulk of our work is concerned with deriving high probability lower bounds on the size of such a ball under various conditions and in various settings: we show how conditions on the density of the data and the kernel function can be used to infer such lower bounds; we further develop an approach that uses a lower bound on the smallest eigenvalue of a covariance operator to…
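The approximation step described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm or analysis. It assumes a hypothetical degree-2 polynomial feature map on the real line as the finite-dimensional RKHS, and it runs a textbook conditional gradient (Frank-Wolfe) iteration with exact line search over the convex hull of the embedded data points. The names `features`, `dot`, and `compress` are made up for this example. The points that end up with positive weight form the coreset, and the radius of the ball discussed in the abstract is not computed here.

```python
import math
import random


def features(x):
    # Hypothetical finite-dimensional feature map (degree-2 polynomial on R).
    return (1.0, x, x * x)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def compress(points, steps):
    """Conditional gradient on 0.5 * ||w - mu||^2 over conv{features(x_i)}.

    Returns (weights, err): a dict mapping point index to convex weight,
    and the Euclidean distance between the weighted embedding and the
    empirical mean embedding mu.
    """
    phis = [features(x) for x in points]
    n, d = len(phis), len(phis[0])
    mu = [sum(p[k] for p in phis) / n for k in range(d)]

    weights = {0: 1.0}  # start at the first embedded point
    w = list(phis[0])
    for _ in range(steps):
        grad = [w[k] - mu[k] for k in range(d)]
        # Linear minimisation oracle: the vertex best aligned with -grad.
        j = min(range(n), key=lambda i: dot(grad, phis[i]))
        direction = [phis[j][k] - w[k] for k in range(d)]
        denom = dot(direction, direction)
        if denom == 0.0:
            break
        # Exact line search for a quadratic objective, clipped to [0, 1].
        gamma = max(0.0, min(1.0, -dot(grad, direction) / denom))
        if gamma == 0.0:
            break
        weights = {i: (1.0 - gamma) * a for i, a in weights.items()}
        weights[j] = weights.get(j, 0.0) + gamma
        weights = {i: a for i, a in weights.items() if a > 0.0}
        w = [w[k] + gamma * direction[k] for k in range(d)]

    err = math.sqrt(sum((w[k] - mu[k]) ** 2 for k in range(d)))
    return weights, err


rng = random.Random(0)
points = [rng.gauss(0.0, 1.0) for _ in range(200)]
weights, err = compress(points, steps=50)
print(len(weights), "coreset points, embedding error", err)
```

Each iteration adds at most one new point, so the coreset never exceeds the number of steps plus one. The exact line search keeps the embedding error from increasing between iterations.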

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Statistical Methods and Inference · Machine Learning and Algorithms