Generative Dataset Distillation Based on Diffusion Model

Duo Su; Junjie Hou; Guang Li; Ren Togo; Rui Song; Takahiro Ogawa; Miki Haseyama

arXiv:2408.08610·cs.CV·August 19, 2024


TL;DR

This paper introduces a fast dataset distillation method based on Stable Diffusion, specifically the SDXL-Turbo model, that reaches IPC = 20 on CIFAR-100 and IPC = 10 on Tiny-ImageNet within the challenge's 10-minute generation limit, and that placed third in the ECCV 2024 DD Challenge.

Contribution

The paper proposes a diffusion model-based dataset distillation technique that uses SDXL-Turbo for high-speed image generation, which allows more images per class (IPC) to be generated within the challenge's time limit.

Findings

01

Achieved IPC=10 for Tiny-ImageNet and IPC=20 for CIFAR-100.

02

Demonstrated the effectiveness of class-based text prompts and data augmentation.

03

Secured third place in the ECCV 2024 DD Challenge.
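
As a rough illustration of the class-based text prompts mentioned in finding 02, the sketch below builds one prompt list per class, with IPC entries each. The template string, the function name `build_prompt_plan`, and the two example class names are assumptions made for illustration; the page does not state the prompt wording the authors used.

```python
from typing import Dict, List


def build_prompt_plan(
    class_names: List[str],
    ipc: int,
    template: str = "a photo of {name}",  # hypothetical template, not from the paper
) -> Dict[str, List[str]]:
    """Map each class name to `ipc` text prompts, one per image to generate."""
    if ipc < 1:
        raise ValueError("ipc must be at least 1")
    return {name: [template.format(name=name)] * ipc for name in class_names}


# Two illustrative class names with the CIFAR-100 setting of IPC = 20.
plan = build_prompt_plan(["apple", "bicycle"], ipc=20)
total_images = sum(len(prompts) for prompts in plan.values())
```

Each prompt in the plan would then be passed to the text-to-image model once, so the number of generated images equals the number of classes times IPC.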

Abstract

This paper presents our method for the generative track of The First Dataset Distillation Challenge at ECCV 2024. Since the diffusion model has become the mainstay of generative models because of its high-quality generative effects, we focus on distillation methods based on the diffusion model. Because the track allows only a fixed number of images to be generated in 10 minutes for the CIFAR-100 and Tiny-ImageNet datasets, we need a generative model that can generate images at high speed. In this study, we propose a novel generative dataset distillation method based on Stable Diffusion. Specifically, we use the SDXL-Turbo model, which can generate images at high speed and quality. Compared to other diffusion models, which can only reach images per class (IPC) = 1, our method can achieve an IPC = 10 for Tiny-ImageNet and an IPC = 20 for CIFAR-100,…
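
To make the speed requirement concrete, the sketch below works out how many images the reported IPC values imply and how much time each image may take. It assumes the 10-minute limit applies separately to each dataset and covers generation only, which the abstract does not spell out; the class counts (100 for CIFAR-100, 200 for Tiny-ImageNet) are the standard ones for these datasets, and the function name is illustrative.

```python
TIME_LIMIT_SECONDS = 10 * 60  # the 10-minute generation limit from the abstract

# IPC values are those reported in the abstract; class counts are the
# standard sizes of the two datasets.
DATASETS = {
    "CIFAR-100": {"num_classes": 100, "ipc": 20},
    "Tiny-ImageNet": {"num_classes": 200, "ipc": 10},
}


def generation_budget(num_classes: int, ipc: int,
                      time_limit: float = TIME_LIMIT_SECONDS):
    """Return (total images, seconds available per image)."""
    total_images = num_classes * ipc
    return total_images, time_limit / total_images


budgets = {name: generation_budget(**cfg) for name, cfg in DATASETS.items()}
for name, (total_images, seconds_per_image) in budgets.items():
    print(f"{name}: {total_images} images, {seconds_per_image:.2f} s per image")
```

Under these assumptions both datasets call for 2,000 images, or about 0.3 seconds per image, which is consistent with the abstract's emphasis on a high-speed generator.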

Taxonomy

Topics: Process Optimization and Integration · Metaheuristic Optimization Algorithms Research

Methods: Diffusion · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Focus