Elucidating the Design Space of Dataset Condensation
Shitong Shao, Zikai Zhou, Huanran Chen, Zhiqiang Shen

TL;DR
This paper introduces a new framework called Elucidate Dataset Condensation (EDC) that improves the efficiency and effectiveness of dataset condensation, achieving state-of-the-art results on large-scale datasets like ImageNet-1k.
Contribution
The paper proposes a comprehensive design framework for dataset condensation, including strategies like soft category-aware matching and learning rate adjustments, setting new benchmarks for small and large datasets.
Findings
EDC achieves 48.6% accuracy on ImageNet-1k with ResNet-18 at 10 images per class (IPC 10).
EDC outperforms previous methods SRe2L, G-VBSM, and RDED by significant margins.
The approach is grounded in empirical evidence and theoretical analysis.
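To make the IPC 10 setting concrete, the short sketch below works out how small the condensed dataset is relative to the original. It assumes the standard ImageNet-1k (ILSVRC-2012) training split of 1,281,167 images across 1,000 classes. That figure is not stated in this summary, and the helper names are illustrative only.

```python
# Assumption: size of the standard ImageNet-1k (ILSVRC-2012) training split.
IMAGENET_1K_TRAIN_IMAGES = 1_281_167
NUM_CLASSES = 1_000


def condensed_size(ipc: int, num_classes: int = NUM_CLASSES) -> int:
    """Total number of synthetic images for a given images-per-class (IPC)."""
    return ipc * num_classes


def compression_ratio(
    ipc: int,
    num_classes: int = NUM_CLASSES,
    full_size: int = IMAGENET_1K_TRAIN_IMAGES,
) -> float:
    """Fraction of the original training set kept after condensation."""
    return condensed_size(ipc, num_classes) / full_size


if __name__ == "__main__":
    print(condensed_size(10))              # 10000 synthetic images
    print(f"{compression_ratio(10):.2%}")  # 0.78%
```

Under these assumptions, the reported accuracy is obtained from a training set of 10,000 synthetic images, under 1% of the original data.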
Abstract
Dataset condensation, a concept within data-centric learning, efficiently transfers critical attributes from an original dataset to a synthetic version, maintaining both diversity and realism. This approach significantly improves model training efficiency and is adaptable across multiple application areas. Previous methods in dataset condensation have faced challenges: some incur high computational costs which limit scalability to larger datasets (e.g., MTT, DREAM, and TESLA), while others are restricted to less optimal design spaces, which could hinder potential improvements, especially in smaller datasets (e.g., SRe2L, G-VBSM, and RDED). To address these limitations, we propose a comprehensive design framework that includes specific, effective strategies like implementing soft category-aware matching and adjusting the learning rate schedule. These strategies are grounded in empirical…
Taxonomy
Topics: Advanced Clustering Algorithms Research · Neural Networks and Applications
