Flash-DMD: Towards High-Fidelity Few-Step Image Generation with Efficient Distillation and Joint Reinforcement Learning

Guanjie Chen; Shirui Huang; Kai Liu; Jianchen Zhu; Xiaoye Qu; Peng Chen; Yu Cheng; Yifu Sun

arXiv:2511.20549·cs.CV·November 26, 2025

TL;DR

Flash-DMD is a framework for fast, stable, high-quality few-step image generation. It combines efficient timestep-aware distillation with joint reinforcement learning, outperforming DMD2 at only 2.1% of its training cost while improving generation fidelity.

Contribution

The paper proposes a timestep-aware distillation strategy and a joint RL training scheme that improve the efficiency, stability, and quality of few-step diffusion-based image generation.

Findings

01. Outperforms DMD2 with only 2.1% of its training cost.

02. Achieves state-of-the-art quality in few-step sampling.

03. Demonstrates improved visual quality and human preference metrics.

Abstract

Diffusion models have emerged as a leading class of generative models, yet their iterative sampling process remains computationally expensive. Timestep distillation is a promising technique to accelerate generation, but it often requires extensive training and leads to image quality degradation. Furthermore, fine-tuning these distilled models for specific objectives, such as aesthetic appeal or user preference, using Reinforcement Learning (RL) is notoriously unstable and prone to reward hacking. In this work, we introduce Flash-DMD, a novel framework that enables fast convergence with distillation and joint RL-based refinement. Specifically, we first propose an efficient timestep-aware distillation strategy that significantly reduces training cost with enhanced realism, outperforming DMD2 with only $2.1\%$ of its training cost. Second, we introduce a joint training scheme where…

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Image Enhancement Techniques · Visual Attention and Saliency Detection