Dataset Condensation with Gradient Matching
Bo Zhao, Konda Reddy Mopuri, Hakan Bilen

TL;DR
This paper introduces Dataset Condensation, a method that synthesizes small, informative datasets by matching gradients, enabling efficient training of neural networks and outperforming existing techniques in vision tasks.
Contribution
It presents a novel gradient matching approach for dataset condensation, improving data efficiency and model training performance in vision benchmarks.
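As a hedged sketch of what "gradient matching" can mean formally, the goal can be written as an optimization over the synthetic set. The symbols below are assumptions introduced for illustration and do not appear in this summary: \mathcal{S} is the synthetic set, \mathcal{T} the original training set, \theta_t the network weights at training step t, \mathcal{L} the training loss, and D a distance between gradients.

```latex
\min_{\mathcal{S}} \;
\mathbb{E}_{\theta_0 \sim P_{\theta_0}}
\left[ \sum_{t=0}^{T-1}
  D\!\left( \nabla_{\theta} \mathcal{L}^{\mathcal{S}}(\theta_t),\;
            \nabla_{\theta} \mathcal{L}^{\mathcal{T}}(\theta_t) \right)
\right]
```

Read this way, the synthetic samples are adjusted so that, along a training trajectory, the weight gradients they produce stay close to the gradients produced by the original data.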
Findings
Significantly outperforms state-of-the-art methods in vision benchmarks.
Effective in continual learning and neural architecture search scenarios.
Reduces data storage and training costs while maintaining accuracy.
Abstract
As the state-of-the-art machine learning methods in many fields rely on larger datasets, storing datasets and training models on them become significantly more expensive. This paper proposes a training set synthesis technique for data-efficient learning, called Dataset Condensation, that learns to condense a large dataset into a small set of informative synthetic samples for training deep neural networks from scratch. We formulate this goal as a gradient matching problem between the gradients of deep neural network weights that are trained on the original and our synthetic data. We rigorously evaluate its performance in several computer vision benchmarks and demonstrate that it significantly outperforms the state-of-the-art methods. Finally, we explore the use of our method in continual learning and neural architecture search and report promising gains when limited memory and computations…
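The gradient matching idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a linear least-squares model in place of a deep network, a cosine distance between whole gradient vectors, and hypothetical helper names (grad_mse, cosine_distance, matching_loss). It only evaluates the matching loss for a fixed weight vector; the optimization of the synthetic samples themselves is left out.

```python
import math

def grad_mse(w, xs, ys):
    """Gradient of mean 0.5 * (w.x - y)^2 with respect to w (toy linear model)."""
    n = len(xs)
    g = [0.0] * len(w)
    for x, y in zip(xs, ys):
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        for j, xj in enumerate(x):
            g[j] += err * xj / n
    return g

def cosine_distance(a, b, eps=1e-12):
    """1 - cosine similarity; eps guards against division by zero."""
    dot = sum(ai * bi for ai, bi in zip(a, b))
    na = math.sqrt(sum(ai * ai for ai in a))
    nb = math.sqrt(sum(bi * bi for bi in b))
    return 1.0 - dot / (na * nb + eps)

def matching_loss(w, real, syn):
    """Distance between gradients computed on real and on synthetic data."""
    g_real = grad_mse(w, *real)
    g_syn = grad_mse(w, *syn)
    return cosine_distance(g_real, g_syn)

# Toy data (assumed for illustration): three real samples, one synthetic sample.
w = [0.0, 0.0]
real = ([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [1.0, 2.0, 3.0])
syn_good = ([[4.0, 5.0]], [1.0])   # its gradient is parallel to the real gradient
syn_bad = ([[5.0, -4.0]], [1.0])   # its gradient is orthogonal to the real gradient

print(matching_loss(w, real, syn_good))
print(matching_loss(w, real, syn_bad))
```

In this toy setup a single synthetic sample whose gradient points in the same direction as the real-data gradient gives a matching loss near zero, which is the sense in which a small synthetic set can stand in for a larger one at a given weight vector.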
Taxonomy
Topics: Machine Learning and Data Classification · Generative Adversarial Networks and Image Synthesis · Anomaly Detection Techniques and Applications
