Data Distillation Can Be Like Vodka: Distilling More Times For Better Quality
Xuxi Chen, Yu Yang, Zhangyang Wang, Baharan Mirzasoleiman

TL;DR
This paper introduces Progressive Dataset Distillation (PDD), a method that synthesizes multiple datasets to better capture training dynamics, significantly improving distillation performance and enabling larger synthetic datasets.
Contribution
The paper proposes PDD, a novel approach that uses multiple synthetic datasets conditioned on previous ones to enhance dataset distillation performance.
Findings
PDD improves existing distillation methods by up to 4.3%.
It enables generation of larger synthetic datasets.
PDD captures training dynamics at different phases.
Abstract
Dataset distillation aims to minimize the time and memory needed for training deep networks on large datasets, by creating a small set of synthetic images that has a similar generalization performance to that of the full dataset. However, current dataset distillation techniques fall short, showing a notable performance gap when compared to training on the original data. In this work, we are the first to argue that using just one synthetic subset for distillation will not yield optimal generalization performance. This is because the training dynamics of deep networks drastically change during the training. Hence, multiple synthetic subsets are required to capture the training dynamics at different phases of training. To address this issue, we propose Progressive Dataset Distillation (PDD). PDD synthesizes multiple small sets of synthetic images, each conditioned on the previous sets, and…
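The staged structure described in the abstract can be sketched as a simple loop. This is only an illustrative outline under stated assumptions, not the authors' implementation: `progressive_distill`, `StageDistiller`, and `toy_stage` are hypothetical names, and the toy stage is a placeholder standing in for a real distillation method that would optimize synthetic images. The sketch shows only the conditioning pattern, where each stage receives everything synthesized before it. How the sets are then used for training is not shown, because the abstract above is truncated at that point.

```python
from typing import Callable, List, Tuple

# Toy stand-in for one synthetic example: (feature value, class label).
Sample = Tuple[float, int]
# Hypothetical stage distiller: maps all previously synthesized samples
# to one new small synthetic set.
StageDistiller = Callable[[List[Sample]], List[Sample]]


def progressive_distill(distill_stage: StageDistiller, num_stages: int) -> List[List[Sample]]:
    """Build `num_stages` synthetic sets, each conditioned on the earlier ones."""
    stages: List[List[Sample]] = []
    for _ in range(num_stages):
        # Flatten every earlier set; this is the conditioning context.
        previous = [sample for stage in stages for sample in stage]
        stages.append(distill_stage(previous))
    return stages


def toy_stage(previous: List[Sample]) -> List[Sample]:
    """Placeholder (assumption): a real distiller would optimize images here."""
    offset = float(len(previous))  # depends on earlier sets, so conditioning is visible
    return [(offset, 0), (offset + 0.5, 1)]


stages = progressive_distill(toy_stage, num_stages=3)
for index, stage in enumerate(stages):
    print(index, stage)
```

Replacing `toy_stage` with a wrapper around an existing distillation method would keep the loop unchanged, which mirrors the page's framing of PDD as an improvement layered on existing methods.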
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification
