DRiffusion: Draft-and-Refine Process Parallelizes Diffusion Models with Ease
Runsheng Bai, Chengyu Zhang, Yangdong Deng

TL;DR
DRiffusion introduces a parallel sampling framework for diffusion models that significantly accelerates inference with minimal quality loss, enabling faster high-fidelity content generation.
Contribution
The paper presents DRiffusion, a novel draft-and-refine parallel sampling method that accelerates diffusion inference by up to 3.7x while maintaining quality.
Findings
Achieves 1.4x to 3.7x speedup across multiple models.
Maintains comparable FID and CLIP scores to original models.
Shows only minor drops in PickScore and HPSv2.1 metrics.
Abstract
Diffusion models have achieved remarkable success in generating high-fidelity content but suffer from slow, iterative sampling, resulting in high latency that limits their use in interactive applications. We introduce DRiffusion, a parallel sampling framework that parallelizes diffusion inference through a draft-and-refine process. DRiffusion employs skip transitions to generate multiple draft states for future timesteps and computes their corresponding noises in parallel; these noises are then used in the standard denoising process to produce refined results. Theoretically, our method achieves an acceleration rate determined by the number of devices, with the exact rate depending on whether the conservative or aggressive mode is used. Empirically, DRiffusion attains a 1.4x to 3.7x speedup across multiple diffusion models while incurring minimal degradation in…
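The draft-and-refine loop described in the abstract can be illustrated with a toy sketch. This is one simplified reading of the abstract, not the authors' implementation: it does not reproduce the paper's conservative or aggressive modes, and every name here (`toy_noise_model`, `ddim_jump`, `draft_and_refine`, the scalar state, the `alpha_bar` schedule) is a hypothetical stand-in. The skip transition is assumed to be a deterministic DDIM-style jump, and threads stand in for devices.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def toy_noise_model(x, t):
    """Hypothetical stand-in for a trained noise predictor eps_theta(x, t)."""
    return 0.1 * x + 0.01 * t


def ddim_jump(x, eps, ab_from, ab_to):
    """Deterministic DDIM-style transition between two noise levels (assumed form)."""
    x0 = (x - math.sqrt(1.0 - ab_from) * eps) / math.sqrt(ab_from)
    return math.sqrt(ab_to) * x0 + math.sqrt(1.0 - ab_to) * eps


def draft_and_refine(x, alpha_bar, n_devices):
    """Toy sampler; alpha_bar[0] is the noisiest level, alpha_bar[-1] the cleanest."""
    last = len(alpha_bar) - 1
    i = 0
    eps = toy_noise_model(x, i)  # noise estimate for the current state
    with ThreadPoolExecutor(max_workers=n_devices) as pool:
        while i < last:
            m = min(n_devices, last - i)
            targets = [i + k for k in range(1, m + 1)]
            # Draft: skip directly from step i to each future step.
            drafts = [ddim_jump(x, eps, alpha_bar[i], alpha_bar[j]) for j in targets]
            # Parallel: one noise prediction per draft (threads stand in for devices).
            draft_eps = list(pool.map(toy_noise_model, drafts, targets))
            # Refine: ordinary one-step denoising, reusing the draft noises.
            for k, j in enumerate(targets):
                x = ddim_jump(x, eps, alpha_bar[j - 1], alpha_bar[j])
                eps = draft_eps[k]
            i += m
    return x
```

In this sketch each round issues one batch of parallel noise predictions and advances up to `n_devices` denoising steps. With `n_devices=1` the single draft coincides with the next sequential state, so the loop reduces to plain step-by-step sampling; with more devices, later steps in a round rely on noise predicted at draft states rather than at the refined states.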
