Dataset Condensation via Generative Model
David Junhao Zhang, Heng Wang, Chuhui Xue, Rui Yan, Wenqing Zhang, Song Bai, Mike Zheng Shou

TL;DR
This paper introduces a dataset condensation method that condenses large datasets, including ImageNet-1k, into a generative model rather than into pixels, while keeping the condensed samples diverse and discriminable.
Contribution
The paper proposes a new dataset condensation approach via a generative model, enabling scalable condensation and improved sample diversity and class separation.
Findings
Effective condensation on large datasets like ImageNet-1k.
The size of the generative model remains relatively stable as the number of classes or the image resolution increases.
Intra- and inter-class losses improve sample diversity and discriminability.
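This summary does not give the formulas for the intra- and inter-class losses, so the following is only a generic sketch of the idea, not the paper's formulation. It assumes each condensed sample is represented by a feature vector. The intra-class term here is the mean pairwise cosine similarity within a class (lower means more diverse), and the inter-class term is a hinge penalty on class centers that lie closer than a margin. The function names, the margin value, and the choice of cosine similarity and Euclidean distance are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def intra_class_loss(features):
    """Mean pairwise cosine similarity inside one class.

    Illustrative assumption: minimizing this pushes samples of the
    same class apart in feature space, i.e. encourages diversity.
    """
    n = len(features)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    if not pairs:
        return 0.0
    return sum(cosine(features[i], features[j]) for i, j in pairs) / len(pairs)


def class_center(features):
    """Per-dimension mean of the feature vectors of one class."""
    dim = len(features[0])
    return [sum(f[d] for f in features) / len(features) for d in range(dim)]


def inter_class_loss(features_by_class, margin=1.0):
    """Hinge penalty on pairs of class centers closer than `margin`.

    Illustrative assumption: minimizing this keeps classes separated,
    i.e. encourages discriminability.
    """
    centers = [class_center(f) for f in features_by_class.values()]
    total, count = 0.0, 0
    for i in range(len(centers)):
        for j in range(i + 1, len(centers)):
            total += max(0.0, margin - math.dist(centers[i], centers[j]))
            count += 1
    return total / count if count else 0.0


# Toy features, made up for this example.
toy = {"cat": [[1.0, 0.0], [0.0, 1.0]], "dog": [[5.0, 5.0], [6.0, 6.0]]}
for name, feats in toy.items():
    print(name, "intra-class loss:", intra_class_loss(feats))
print("inter-class loss:", inter_class_loss(toy))
```

In this toy data the two "cat" features are orthogonal, so they count as diverse, while the two "dog" features point in the same direction and would be penalized by the intra-class term. The class centers are far apart, so the inter-class term contributes nothing.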
Abstract
Dataset condensation aims to condense a large dataset with many training samples into a small set. Previous methods usually condense the dataset into pixel format. However, this format suffers from slow optimization and a large number of parameters to optimize. As image resolution and the number of classes increase, the number of learnable parameters grows accordingly, which prevents condensation methods from scaling up to large datasets with diverse classes. Moreover, the relations among condensed samples have been neglected, so the feature distribution of the condensed samples is often not diverse. To solve these problems, we propose to condense the dataset into another format, a generative model. This format allows for the condensation of large datasets because the size of the generative model remains relatively stable as the number of classes or image resolution increases.…
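The scaling argument in the abstract can be made concrete with a little arithmetic. The sketch below counts the learnable values in a pixel-format condensed set, which is one value per class, per image, per channel, per pixel. For contrast, it also counts parameters for a hypothetical conditional generator modeled as a fixed backbone plus a class-embedding table. The 10 images per class, the 224x224 ImageNet-1k resolution, the backbone size, and the embedding width are assumptions chosen for illustration and are not figures from the paper.

```python
def pixel_format_params(num_classes, images_per_class, height, width, channels=3):
    """Learnable values when every condensed image is optimized pixel by pixel."""
    return num_classes * images_per_class * channels * height * width


def conditional_generator_params(num_classes, backbone_params, embed_dim):
    """Hypothetical generator: a fixed backbone plus one embedding row per class.

    Only the embedding table grows with the number of classes.
    """
    return backbone_params + num_classes * embed_dim


# Assumed settings: 10 images per class, RGB images.
cifar10_pixels = pixel_format_params(num_classes=10, images_per_class=10, height=32, width=32)
imagenet_pixels = pixel_format_params(num_classes=1000, images_per_class=10, height=224, width=224)

# Assumed generator: 10M-parameter backbone, 128-dimensional class embeddings.
gen_small = conditional_generator_params(num_classes=10, backbone_params=10_000_000, embed_dim=128)
gen_large = conditional_generator_params(num_classes=1000, backbone_params=10_000_000, embed_dim=128)

print("pixel format, 10 classes at 32x32:    ", cifar10_pixels)
print("pixel format, 1000 classes at 224x224:", imagenet_pixels)
print("pixel growth factor:                  ", imagenet_pixels // cifar10_pixels)
print("generator, 10 classes:                ", gen_small)
print("generator, 1000 classes:              ", gen_large)
```

Under these assumptions the pixel-format parameter count grows by a factor of several thousand between the two settings, while the hypothetical generator grows by roughly one percent. A real generator may also need extra layers at higher resolutions, which this toy model ignores.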
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Digital Imaging for Blood Diseases · AI in cancer detection
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
