Dataset Condensation via Efficient Synthetic-Data Parameterization
Jang-Hyun Kim, Jinuk Kim, Seong Joon Oh, Sangdoo Yun, Hwanjun Song, Joonhyun Jeong, Jung-Woo Ha, Hyun Oh Song

TL;DR
This paper introduces a dataset condensation method that synthesizes a compact training set under a limited storage budget by exploiting data regularity in the synthetic-data parameterization and by improving on gradient matching-based optimization.
Contribution
It presents a novel condensation framework with efficient parameterization and an improved optimization technique, outperforming existing methods on multiple datasets.
Findings
Higher-quality condensed data than existing methods on CIFAR-10, ImageNet, and Speech Commands.
Effective optimization addressing limitations of gradient matching methods.
Reduced storage requirements with high-fidelity synthetic datasets.
Abstract
The great success of machine learning with massive amounts of data comes at a price of huge computation costs and storage for training and tuning. Recent studies on dataset condensation attempt to reduce the dependence on such massive data by synthesizing a compact training dataset. However, the existing approaches have fundamental limitations in optimization due to the limited representability of synthetic datasets without considering any data regularity characteristics. To this end, we propose a novel condensation framework that generates multiple synthetic data with a limited storage budget via efficient parameterization considering data regularity. We further analyze the shortcomings of the existing gradient matching-based condensation methods and develop an effective optimization technique for improving the condensation of training data information. We propose a unified algorithm…
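The two ideas named in the abstract, forming several training examples from a small stored array and matching gradients between real and synthetic data, can be illustrated with a minimal NumPy sketch. Everything here is an assumption for illustration and is not taken from the paper: `form_examples` is a hypothetical decoder (patch split plus nearest-neighbour upsampling), the model is a plain linear regressor, and the matching distance is a squared L2 norm.

```python
import numpy as np

def form_examples(stored, factor=2):
    """Hypothetical decoder: cut each stored image into factor x factor patches
    and upsample every patch back to full size, so one stored image yields
    factor**2 training examples. Assumes height and width divide by factor."""
    n, h, w = stored.shape
    ph, pw = h // factor, w // factor
    out = []
    for img in stored:
        for i in range(factor):
            for j in range(factor):
                patch = img[i * ph:(i + 1) * ph, j * pw:(j + 1) * pw]
                # Nearest-neighbour upsampling: repeat each pixel factor x factor times.
                out.append(np.kron(patch, np.ones((factor, factor))))
    return np.stack(out)

def mse_grad(w, x, y):
    """Gradient of the mean squared error of the linear model x @ w w.r.t. w."""
    residual = x @ w - y
    return 2.0 * x.T @ residual / len(x)

def gradient_matching_loss(w, x_real, y_real, x_syn, y_syn):
    """Squared L2 distance between the real-batch and synthetic-batch gradients
    (an illustrative choice of distance, not necessarily the paper's)."""
    g_real = mse_grad(w, x_real, y_real)
    g_syn = mse_grad(w, x_syn, y_syn)
    return float(np.sum((g_real - g_syn) ** 2))

# Usage: 3 stored images expand to 12 synthetic examples at the same storage cost.
rng = np.random.default_rng(0)
stored = rng.normal(size=(3, 8, 8))
x_syn = form_examples(stored, factor=2).reshape(12, -1)
y_syn = rng.normal(size=12)
x_real = rng.normal(size=(100, 64))
y_real = rng.normal(size=100)
w = rng.normal(size=64)
loss = gradient_matching_loss(w, x_real, y_real, x_syn, y_syn)
```

In a full condensation loop the stored array, not the formed examples, would be the quantity updated to reduce this loss, which is why the decoder must be differentiable in practice; this sketch only evaluates the loss.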
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Advanced Neural Network Applications · Speech Recognition and Synthesis
