DELT: A Simple Diversity-driven EarlyLate Training for Dataset Distillation

Zhiqiang Shen; Ammar Sherif; Zeyuan Yin; Shitong Shao

arXiv:2411.19946·cs.CV·June 10, 2025

TL;DR

DELT introduces a simple yet effective diversity-driven training scheme for dataset distillation that partitions data, employs local optimizations, and improves diversity, generalization, and efficiency over previous methods.

Contribution

The paper proposes DELT, a novel diversity-driven training approach that enhances dataset distillation by partitioning data and reducing synthesis time, outperforming prior state-of-the-art methods.

Findings

01 Outperforms previous methods by 2-5% on average across datasets.

02 Increases class diversity by more than 5%.

03 Reduces synthesis time by up to 39.3%.

Abstract

Recent advances in dataset distillation have led to solutions in two main directions. The conventional batch-to-batch matching mechanism is ideal for small-scale datasets and includes bi-level optimization methods on models and syntheses, such as FRePo, RCIG, and RaT-BPTT, as well as other methods like distribution matching, gradient matching, and weight trajectory matching. Conversely, batch-to-global matching typifies decoupled methods, which are particularly advantageous for large-scale datasets. This approach has garnered substantial interest within the community, as seen in SRe$^{2}$L, G-VBSM, WMDD, and CDA. A primary challenge with the second approach is the lack of diversity among syntheses within each class since samples are optimized independently and the same global supervision signals are reused across different synthetic images. In this study, we propose a new Diversity-driven…

Code & Models

Repositories

vila-lab/delt (PyTorch, official)

Taxonomy

Topics: Machine Learning and Data Classification · Neural Networks and Applications