Diversity-Driven Generative Dataset Distillation Based on Diffusion Model with Self-Adaptive Memory

Mingzhuo Li; Guang Li; Jiafeng Mao; Takahiro Ogawa; Miki Haseyama

arXiv:2505.19469·cs.LG·May 27, 2025

TL;DR

This paper introduces a novel diversity-driven dataset distillation method using diffusion models and self-adaptive memory to produce more representative and diverse datasets, improving downstream task performance.

Contribution

The paper proposes a new diffusion-based dataset distillation approach with self-adaptive memory to enhance diversity and representativeness of distilled datasets.

Findings

01. Outperforms existing methods in most scenarios

02. Produces more diverse and representative datasets

03. Improves downstream validation accuracy

Abstract

Dataset distillation enables the training of deep neural networks with comparable performance in significantly reduced time by compressing large datasets into small and representative ones. Although the introduction of generative models has made great achievements in this field, the distributions of their distilled datasets are not diverse enough to represent the original ones, leading to a decrease in downstream validation accuracy. In this paper, we present a diversity-driven generative dataset distillation method based on a diffusion model to solve this problem. We introduce self-adaptive memory to align the distribution between distilled and real datasets, assessing the representativeness. The degree of alignment leads the diffusion model to generate more diverse datasets during the distillation process. Extensive experiments show that our method outperforms existing…
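
The abstract does not say how the memory is built, updated, or fed back into the diffusion model, so the following is only an illustrative sketch of the general idea of scoring a generated sample against a memory of previously seen features. Everything in it is an assumption made for illustration, not the authors' design: the `FeatureMemory` class, the first-in-first-out eviction, the use of cosine similarity, and the `diversity_penalty` function are all hypothetical.

```python
import math
from collections import deque


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class FeatureMemory:
    """Hypothetical fixed-size memory of feature vectors (assumption, not the paper's design)."""

    def __init__(self, capacity):
        # deque(maxlen=...) silently evicts the oldest entry once capacity is reached
        self.bank = deque(maxlen=capacity)

    def redundancy(self, feature):
        """Highest cosine similarity between `feature` and any stored entry."""
        if not self.bank:
            return 0.0
        return max(cosine_similarity(feature, stored) for stored in self.bank)

    def update(self, feature):
        """Store a copy of the feature vector."""
        self.bank.append(list(feature))


def diversity_penalty(memory, feature):
    """Assumed penalty: larger when the candidate resembles what the memory already holds."""
    return max(0.0, memory.redundancy(feature))
```

Under these assumptions, a candidate whose feature vector points in the same direction as a stored entry receives a penalty near 1, while an orthogonal one receives a penalty of 0. A term of this kind could in principle be added to a generator's training objective to discourage near-duplicate samples; whether the paper does anything similar cannot be determined from the abstract alone.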

Taxonomy

Topics: Metaheuristic Optimization Algorithms Research

Methods: Diffusion · ALIGN