Are Large-scale Soft Labels Necessary for Large-scale Dataset Distillation?
Lingao Xiao, Yang He

TL;DR
This paper demonstrates that class-wise supervision during dataset distillation increases within-class diversity, which reduces the need for large-scale soft labels and enables significant soft label compression with performance gains.
Contribution
The paper introduces class-wise batching during image synthesis in dataset distillation, which reduces the size and complexity of the required soft labels while improving image diversity and performance.
Findings
40x reduction in soft label size
2.6% performance improvement
Effective soft label pruning method
Abstract
In ImageNet-condensation, the storage for auxiliary soft labels exceeds that of the condensed dataset by over 30 times. However, are large-scale soft labels necessary for large-scale dataset distillation? In this paper, we first discover that the high within-class similarity in condensed datasets necessitates the use of large-scale soft labels. This high within-class similarity can be attributed to the fact that previous methods use samples from different classes to construct a single batch for batch normalization (BN) matching. To reduce the within-class similarity, we introduce class-wise supervision during the image synthesizing process by batching the samples within classes, instead of across classes. As a result, we can increase within-class diversity and reduce the size of required soft labels. A key benefit of improved image diversity is that soft label compression can be…
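The batching change described in the abstract can be illustrated with a minimal sketch. It only shows how sample indices might be grouped; the BN-matching objective and the image synthesis itself are not shown. The function names `cross_class_batches` and `class_wise_batches` and the toy labels are hypothetical, not taken from the paper's code.

```python
from collections import defaultdict


def cross_class_batches(labels, batch_size):
    """Baseline grouping: slice the dataset in order, so a batch may mix classes."""
    indices = list(range(len(labels)))
    return [indices[i:i + batch_size] for i in range(0, len(indices), batch_size)]


def class_wise_batches(labels, batch_size):
    """Class-wise grouping: every batch holds samples from exactly one class."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    batches = []
    for label in sorted(by_class):
        members = by_class[label]
        # Split one class's samples into chunks of at most batch_size.
        for i in range(0, len(members), batch_size):
            batches.append(members[i:i + batch_size])
    return batches


# Toy example: six synthetic images, three classes, two images per class.
labels = [0, 1, 2, 0, 1, 2]
mixed = cross_class_batches(labels, batch_size=2)
per_class = class_wise_batches(labels, batch_size=2)
```

In this toy setup, each batch in `mixed` contains two different classes, while each batch in `per_class` contains a single class, so any per-batch statistic would be computed within one class only.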
Taxonomy
Topics: Artificial Immune Systems Applications · Data Stream Mining Techniques · Machine Learning and Data Classification
Methods: Batch Normalization
