Label-Augmented Dataset Distillation

Seoungyoon Kang; Youngsun Lim; Hyunjung Shim

arXiv:2409.16239·cs.CV·September 25, 2024

Open Access

TL;DR

This paper introduces Label-Augmented Dataset Distillation (LADD), a framework that enhances dataset distillation by adding dense label information, leading to significant improvements in training efficiency, accuracy, and robustness across various datasets and algorithms.

Contribution

LADD is a novel dataset distillation method that incorporates label augmentation with dense labels, improving performance and robustness with minimal storage overhead.

Findings

01

LADD improves accuracy by 14.9% on average.

02

LADD's dense labels require only 2.5% additional storage (on ImageNet subsets).

03

LADD enhances cross-architecture robustness of distilled datasets.

Abstract

Traditional dataset distillation focuses primarily on image representation and often overlooks the role of labels. In this study, we introduce Label-Augmented Dataset Distillation (LADD), a framework that enhances dataset distillation with label augmentation. LADD sub-samples each synthetic image and generates additional dense labels that capture rich semantics. These dense labels require only a 2.5% increase in storage (on ImageNet subsets) yet provide strong learning signals and significant performance benefits. Our label generation strategy can complement existing dataset distillation methods, significantly improving their training efficiency and performance. Experimental results demonstrate that LADD outperforms existing methods in terms of computational overhead and accuracy. With three high-performance dataset distillation algorithms, LADD achieves…
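The abstract's core idea, sub-sampling each synthetic image and attaching a dense label to every sub-image, can be illustrated with a minimal sketch. It is not the authors' implementation: the non-overlapping grid crop, the function names (`sub_sample`, `dense_labels`), and the `toy_labeler` (a fixed random linear map standing in for a trained labeling network) are all assumptions made for illustration.

```python
import numpy as np


def sub_sample(image, grid=2):
    """Split an (H, W, C) image into grid x grid non-overlapping crops.

    Assumption: a plain grid crop; the paper may sub-sample differently.
    """
    h = image.shape[0] // grid
    w = image.shape[1] // grid
    return [
        image[r * h:(r + 1) * h, c * w:(c + 1) * w]
        for r in range(grid)
        for c in range(grid)
    ]


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def dense_labels(image, labeler, grid=2):
    """Return one soft label per sub-image, shape (grid * grid, num_classes)."""
    crops = sub_sample(image, grid)
    logits = np.stack([labeler(crop) for crop in crops])
    return softmax(logits)


# Hypothetical stand-in for a trained labeler: projects the mean pixel
# colour of a crop onto 10 class logits with a fixed random matrix.
rng = np.random.default_rng(0)
projection = rng.normal(size=(3, 10))


def toy_labeler(crop):
    return crop.mean(axis=(0, 1)) @ projection


image = rng.random((32, 32, 3))  # one synthetic image
labels = dense_labels(image, toy_labeler, grid=2)
```

Here `labels` holds four soft labels (one per crop) instead of a single label for the whole image. Because each label is a short class-probability vector rather than pixel data, storing several per image adds little to the size of the distilled set, which is consistent with the small storage overhead the abstract reports.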

Taxonomy

Topics: Machine Learning and Data Classification