Accelerating Dataset Distillation via Model Augmentation

Lei Zhang; Jie Zhang; Bowen Lei; Subhabrata Mukherjee; Xiang Pan; Bo Zhao; Caiwen Ding; Yao Li; Dongkuan Xu

arXiv:2212.06152 · cs.LG · April 18, 2023 · 1 citation

TL;DR

This paper introduces model augmentation techniques that accelerate dataset distillation by up to 20x while keeping performance on par with state-of-the-art methods.

Contribution

The paper proposes two model augmentation techniques, early-stage models and parameter perturbation, so that the synthetic set is trained against diverse models at a much lower cost.

Findings

01. Achieves up to 20x speedup in dataset distillation

02. Maintains comparable performance to state-of-the-art methods

03. Reduces computational cost significantly

Abstract

Dataset Distillation (DD), a newly emerging field, aims at generating much smaller but efficient synthetic training datasets from large ones. Existing DD methods based on gradient matching achieve leading performance; however, they are extremely computationally intensive, as they require continuously optimizing a dataset among thousands of randomly initialized models. In this paper, we assume that training the synthetic data with diverse models leads to better generalization performance. We therefore propose two model augmentation techniques, using early-stage models and parameter perturbation, to learn an informative synthetic set at significantly reduced training cost. Extensive experiments demonstrate that our method achieves up to 20x speedup with performance on par with state-of-the-art methods.
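The idea in the abstract can be illustrated with a small, hedged sketch. It is not the authors' implementation: the linear model, the squared-error loss, the cosine gradient-matching distance, the norm-scaled Gaussian perturbation, and every name and hyperparameter below (`early_stage_model`, `perturb`, `alpha`, `steps`) are assumptions chosen for illustration. The sketch only builds a pool of diverse models from one briefly trained model and evaluates a gradient-matching loss over that pool; the update of the synthetic data itself is omitted.

```python
import math
import random


def grad_mse(w, xs, ys):
    """Gradient of the mean squared error of a linear model y = w . x."""
    g = [0.0] * len(w)
    for x, y in zip(xs, ys):
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        for j, xj in enumerate(x):
            g[j] += 2.0 * err * xj / len(xs)
    return g


def early_stage_model(xs, ys, dim, steps, lr, rng):
    """Assumed reading of 'early-stage': train a random init for only a few steps."""
    w = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for _ in range(steps):
        g = grad_mse(w, xs, ys)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w


def perturb(w, alpha, rng):
    """Assumed perturbation: Gaussian noise rescaled to norm alpha * ||w||."""
    noise = [rng.gauss(0.0, 1.0) for _ in w]
    w_norm = math.sqrt(sum(v * v for v in w))
    n_norm = math.sqrt(sum(v * v for v in noise))
    scale = alpha * w_norm / n_norm
    return [wi + scale * ni for wi, ni in zip(w, noise)]


def cosine_distance(a, b):
    """1 - cosine similarity; 0 when the two gradients point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)


def matching_loss(real, syn, models):
    """Average gradient-matching loss of the synthetic set over a model pool."""
    (rx, ry), (sx, sy) = real, syn
    losses = [
        cosine_distance(grad_mse(w, rx, ry), grad_mse(w, sx, sy))
        for w in models
    ]
    return sum(losses) / len(losses)


rng = random.Random(0)
true_w = [1.0, -2.0, 0.5]  # hypothetical ground-truth weights for toy data
real_x = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(200)]
real_y = [sum(w * x for w, x in zip(true_w, row)) for row in real_x]
syn_x, syn_y = real_x[:5], real_y[:5]  # toy synthetic set taken from real samples

# One cheap early-stage model, then several perturbed copies of it.
base = early_stage_model(real_x, real_y, dim=3, steps=3, lr=0.05, rng=rng)
pool = [base] + [perturb(base, alpha=0.1, rng=rng) for _ in range(4)]
loss = matching_loss((real_x, real_y), (syn_x, syn_y), pool)
print(f"pool size: {len(pool)}, matching loss: {loss:.4f}")
```

In this sketch, each perturbed copy costs one noise draw rather than a fresh training run, which is the intuition behind replacing a large collection of independently initialized models with augmented ones. A full distillation loop would additionally differentiate `matching_loss` with respect to the synthetic samples and update them.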

Taxonomy

Topics: Human Pose and Action Recognition · Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification