On the Size and Approximation Error of Distilled Sets
Alaa Maalouf, Murad Tukan, Noel Loo, Ramin Hasani, Mathias Lechner, Daniela Rus

TL;DR
This paper provides a theoretical analysis of dataset distillation for kernel ridge regression, proving the existence of small distilled datasets with quantifiable excess risk, and establishing bounds on their approximation error.
Contribution
It offers the first theoretical guarantees on both the size and the approximation error of distilled datasets for kernel ridge regression, obtained by working in random Fourier feature space.
Findings
Small distilled datasets exist with size linear in RFF dimension or effective degrees of freedom.
The excess risk of distilled datasets can be bounded and depends on regularization.
Empirical verification supports the theoretical bounds.
Abstract
Dataset Distillation is the task of synthesizing small datasets from large ones while still retaining comparable predictive accuracy to the original uncompressed dataset. Despite significant empirical progress in recent years, there is little understanding of the theoretical limitations/guarantees of dataset distillation, specifically, what excess risk is achieved by distillation compared to the original dataset, and how large are distilled datasets? In this work, we take a theoretical view on kernel ridge regression (KRR) based methods of dataset distillation such as Kernel Inducing Points. By transforming ridge regression in random Fourier features (RFF) space, we provide the first proof of the existence of small (size) distilled datasets and their corresponding excess risk for shift-invariant kernels. We prove that a small set of instances exists in the original input space such that…
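To make "ridge regression in random Fourier features (RFF) space" concrete, the following is a minimal NumPy sketch of the general technique, not the authors' construction or their distillation procedure. It assumes a Gaussian kernel, which is one example of a shift-invariant kernel; the synthetic data, the bandwidth `sigma`, the regularization `lam`, and the helper names `rff_features` and `ridge_fit` are all illustrative assumptions.

```python
import numpy as np

def rff_features(X, W, b):
    """Map inputs to random Fourier features: z(x) = sqrt(2/D) * cos(W^T x + b)."""
    D = W.shape[1]
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

def ridge_fit(Z, y, lam):
    """Closed-form ridge weights in feature space: (Z^T Z + lam I)^{-1} Z^T y."""
    D = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + lam * np.eye(D), Z.T @ y)

rng = np.random.default_rng(0)
n, d, D = 500, 3, 200      # samples, input dimension, RFF dimension (assumed values)
sigma, lam = 1.0, 1e-2     # kernel bandwidth and ridge regularization (assumed values)

# Synthetic regression data, for illustration only.
X = rng.normal(size=(n, d))
y = np.sin(X.sum(axis=1)) + 0.1 * rng.normal(size=n)

# For the Gaussian kernel exp(-||x - x'||^2 / (2 sigma^2)), frequencies are
# drawn from N(0, sigma^{-2} I) and phases uniformly from [0, 2 pi).
W = rng.normal(scale=1.0 / sigma, size=(d, D))
b = rng.uniform(0.0, 2.0 * np.pi, size=D)

Z = rff_features(X, W, b)  # shape (n, D)
w = ridge_fit(Z, y, lam)   # shape (D,)
train_mse = float(np.mean((Z @ w - y) ** 2))

# Sanity check: inner products of features approximate the exact kernel.
sq_dists = np.sum((X[:5, None, :] - X[None, :5, :]) ** 2, axis=-1)
K_exact = np.exp(-sq_dists / (2.0 * sigma ** 2))
K_approx = Z[:5] @ Z[:5].T
kernel_err = float(np.max(np.abs(K_exact - K_approx)))

print(f"train MSE: {train_mse:.4f}, max kernel error on 5x5 block: {kernel_err:.4f}")
```

In this formulation the model is a linear regressor over `D` features, so the learned solution is a vector of length `D` regardless of how many training points `n` there are. That is the setting in which the summary above states that distilled-set size can scale with the RFF dimension.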
Taxonomy
Topics: Machine Learning and Data Classification · Gaussian Processes and Bayesian Inference · Sparse and Compressive Sensing Techniques
