Dataset Condensation with Differentiable Siamese Augmentation
Bo Zhao, Hakan Bilen

TL;DR
This paper introduces Differentiable Siamese Augmentation, a data augmentation technique for dataset condensation. It helps condense large training sets into much smaller synthetic sets that can train deep networks from scratch with minimal drop in performance.
Contribution
The paper proposes a new data augmentation technique for dataset condensation that improves synthetic data quality and training performance, outperforming previous methods on multiple benchmarks.
Findings
Achieves 7% improvements over the state of the art on the CIFAR10 and CIFAR100 datasets.
Reaches near-complete performance with less than 1% data on several benchmarks.
Shows promising results in continual learning and neural architecture search.
Abstract
In many machine learning problems, large-scale datasets have become the de-facto standard to train state-of-the-art deep networks at the price of heavy computation load. In this paper, we focus on condensing large training sets into significantly smaller synthetic sets which can be used to train deep neural networks from scratch with minimum drop in performance. Inspired by the recent training set synthesis methods, we propose Differentiable Siamese Augmentation that enables effective use of data augmentation to synthesize more informative synthetic images and thus achieves better performance when training networks with augmentations. Experiments on multiple image classification benchmarks demonstrate that the proposed method obtains substantial gains over the state-of-the-art, with 7% improvements on CIFAR10 and CIFAR100 datasets. We show with only less than 1% data that our method…
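The abstract names two ingredients that a toy example can make concrete. As we understand the method, "Siamese" means one randomly sampled transformation is applied to both the real and the synthetic batch, and "differentiable" means the loss gradient can flow back through that transformation to the synthetic pixels. The sketch below is an illustration under strong assumptions, not the authors' implementation. Images are 1-D lists of floats, the transform (intensity scaling plus a circular shift) is a stand-in for real image augmentations, and the paper's network-based matching objective is replaced by a plain squared error. All function names are hypothetical.

```python
import random


def sample_params(rng, width):
    """Sample ONE set of augmentation parameters (toy transform, assumed)."""
    return {"scale": rng.uniform(0.5, 1.5), "shift": rng.randrange(width)}


def augment(image, params):
    """Scale intensities and circularly shift a 1-D 'image' (linear in pixels)."""
    n = len(image)
    s, k = params["scale"], params["shift"]
    return [s * image[(i - k) % n] for i in range(n)]


def augment_backward(grad_out, params):
    """Backpropagate a gradient through `augment` to the input pixels."""
    n = len(grad_out)
    s, k = params["scale"], params["shift"]
    # output[i] reads input[(i - k) % n], so input[j] receives grad_out[(j + k) % n].
    return [s * grad_out[(j + k) % n] for j in range(n)]


def siamese_step(real, synthetic, rng, lr=0.1):
    """One update of the synthetic image under a shared (Siamese) augmentation."""
    params = sample_params(rng, len(real))   # sampled once ...
    aug_real = augment(real, params)         # ... applied to the real data
    aug_syn = augment(synthetic, params)     # ... and to the synthetic data
    diff = [a - b for a, b in zip(aug_syn, aug_real)]
    loss = sum(d * d for d in diff)          # stand-in matching loss (assumption)
    grad_aug = [2.0 * d for d in diff]       # d loss / d aug_syn
    grad_syn = augment_backward(grad_aug, params)  # d loss / d synthetic
    updated = [x - lr * g for x, g in zip(synthetic, grad_syn)]
    return updated, loss


def condense_toy(real, steps=200, seed=0):
    """Optimize a synthetic image from zeros using freshly sampled augmentations."""
    rng = random.Random(seed)
    synthetic = [0.0] * len(real)
    for _ in range(steps):
        synthetic, _ = siamese_step(real, synthetic, rng)
    return synthetic


if __name__ == "__main__":
    print(condense_toy([0.2, 0.8, 0.5, 0.1]))
```

Because both inputs see the same shift and scale, the comparison stays aligned pixel for pixel. If each input were augmented with independently sampled parameters, the squared error would compare misaligned content and the gradient would no longer pull the synthetic image toward the real one.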
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Multimodal Machine Learning Applications
