IMS3: Breaking Distributional Aggregation in Diffusion-Based Dataset Distillation

Chenru Wang; Yunyi Chen; Zijun Yang; Joey Tianyi Zhou; Chi Zhang

arXiv:2603.13960 · cs.CV · March 17, 2026

Open Access

TL;DR

This paper introduces IMS3, a diffusion-based dataset distillation method that improves the diversity and class separation of synthetic datasets, yielding better generalization and state-of-the-art results among diffusion-based methods.

Contribution

The paper proposes Inversion-Matching and Selective Subgroup Sampling strategies to address distributional coverage and class separability issues in diffusion-based dataset distillation.

Findings

01. Enhanced dataset diversity and coverage.

02. Improved inter-class separability.

03. Achieved state-of-the-art performance among diffusion methods.

Abstract

Dataset Distillation aims to synthesize compact datasets that can approximate the training efficacy of large-scale real datasets, offering an efficient solution to the increasing computational demands of modern deep learning. Recently, diffusion-based dataset distillation methods have shown great promise by leveraging the strong generative capacity of diffusion models to produce diverse and structurally consistent samples. However, a fundamental goal misalignment persists: diffusion models are optimized for generative likelihood rather than discriminative utility, resulting in over-concentration in high-density regions and inadequate coverage of boundary samples crucial for classification. To address this issue, we propose two complementary strategies. Inversion-Matching (IM) introduces an inversion-guided fine-tuning process that aligns denoising trajectories with their inversion…
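
The coverage problem named in the abstract can be illustrated with a toy example. The sketch below is not the paper's method: it assumes a hypothetical 1-D "real" class drawn from a standard normal, two hand-picked synthetic sets, and a simple radius-based coverage score (the function `coverage` and the radius of 0.5 are illustrative choices, not taken from the source). It only shows why a compact set concentrated at the mode tends to leave tail samples uncovered.

```python
import random


def coverage(real, synthetic, radius):
    """Fraction of real points within `radius` of at least one synthetic point."""
    covered = sum(1 for x in real if any(abs(x - s) <= radius for s in synthetic))
    return covered / len(real)


random.seed(0)
# Hypothetical 1-D "real" class: 1000 draws from a standard normal.
real = [random.gauss(0.0, 1.0) for _ in range(1000)]

# Five synthetic points clustered at the mode (high-density region only).
concentrated = [-0.1, -0.05, 0.0, 0.05, 0.1]
# Five synthetic points spread out so that the tails are reached as well.
spread = [-2.0, -1.0, 0.0, 1.0, 2.0]

cov_concentrated = coverage(real, concentrated, radius=0.5)
cov_spread = coverage(real, spread, radius=0.5)
print(f"concentrated: {cov_concentrated:.2f}, spread: {cov_spread:.2f}")
```

With the same budget of five points, the spread set covers more of the real samples than the concentrated one, which is the intuition behind seeking better distributional coverage in a distilled dataset.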

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Advanced Neuroimaging Techniques and Applications · Domain Adaptation and Few-Shot Learning