Label-Augmented Dataset Distillation
Seoungyoon Kang, Youngsun Lim, Hyunjung Shim

TL;DR
This paper introduces Label-Augmented Dataset Distillation (LADD), a framework that enhances dataset distillation by adding dense label information, leading to significant improvements in training efficiency, accuracy, and robustness across various datasets and algorithms.
Contribution
LADD is a novel dataset distillation method that incorporates label augmentation with dense labels, improving performance and robustness with minimal storage overhead.
Findings
LADD improves accuracy by 14.9% on average.
On ImageNet subsets, the dense labels add only 2.5% to storage.
LADD enhances cross-architecture robustness of distilled datasets.
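To make the storage finding concrete, the sketch below shows how a relative label overhead of this kind could be computed. Every number in it (image resolution, number of sub-images, class count, bytes per label value) is a hypothetical placeholder rather than the paper's configuration, so the result is illustrative and is not meant to reproduce the reported 2.5%.

```python
def label_overhead(image_bytes, num_subimages, num_classes, bytes_per_value):
    """Return dense-label storage as a fraction of one image's storage."""
    # One soft label (a vector over classes) is stored per sub-image.
    label_bytes = num_subimages * num_classes * bytes_per_value
    return label_bytes / image_bytes


# Hypothetical setup, NOT taken from the paper:
# a 128x128 RGB image stored as uint8, 25 sub-images, 10 classes, 2-byte values.
image_bytes = 128 * 128 * 3
ratio = label_overhead(image_bytes, num_subimages=25, num_classes=10, bytes_per_value=2)
print(f"label overhead: {ratio:.2%}")
```

The overhead grows linearly with the number of sub-images and classes, which is why it can stay small next to the pixel data when both are modest.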
Abstract
Traditional dataset distillation primarily focuses on image representation while often overlooking the important role of labels. In this study, we introduce Label-Augmented Dataset Distillation (LADD), a new dataset distillation framework enhancing dataset distillation with label augmentations. LADD sub-samples each synthetic image, generating additional dense labels to capture rich semantics. These dense labels require only a 2.5% increase in storage (ImageNet subsets) with significant performance benefits, providing strong learning signals. Our label generation strategy can complement existing dataset distillation methods for significantly enhancing their training efficiency and performance. Experimental results demonstrate that LADD outperforms existing methods in terms of computational overhead and accuracy. With three high-performance dataset distillation algorithms, LADD achieves…
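The sub-sampling idea in the abstract can be sketched roughly as follows: slide a window over a synthetic image, then ask a labeler for one soft label per sub-image. This is a minimal illustration under stated assumptions, not the authors' implementation. The crop size, stride, and `toy_labeler` (a random linear classifier standing in for a trained model) are all hypothetical.

```python
import numpy as np


def subsample_grid(image, crop, stride):
    """Slide a crop x crop window over an (H, W, C) image and stack the crops."""
    h, w, _ = image.shape
    crops = [
        image[top:top + crop, left:left + crop]
        for top in range(0, h - crop + 1, stride)
        for left in range(0, w - crop + 1, stride)
    ]
    return np.stack(crops)


def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def dense_labels(image, labeler, crop, stride):
    """Return one soft label per sub-image, shape (num_crops, num_classes)."""
    crops = subsample_grid(image, crop, stride)
    return softmax(labeler(crops))


# Hypothetical stand-in for a trained labeler: a random linear map applied
# to the mean color of each crop. A real pipeline would use a trained network.
rng = np.random.default_rng(0)
num_classes = 10
weights = rng.normal(size=(3, num_classes))


def toy_labeler(crops):
    return crops.mean(axis=(1, 2)) @ weights


image = rng.random((64, 64, 3))  # placeholder "synthetic image"
labels = dense_labels(image, toy_labeler, crop=32, stride=16)
print(labels.shape)
```

With a 64x64 image, a 32-pixel crop, and a stride of 16, the window fits at three positions along each axis, so the image yields nine sub-images and nine soft labels.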
Taxonomy
Topics: Machine Learning and Data Classification
