Flash Diffusion: Accelerating Any Conditional Diffusion Model for Few Steps Image Generation
Clément Chadebec, Onur Tasar, Eyal Benaroche, Benjamin Aubin

TL;DR
This paper introduces Flash Diffusion, a distillation method that accelerates pre-trained conditional diffusion models so that they generate high-quality images in a few sampling steps. It applies across tasks and backbones and needs only several GPU hours of training.
Contribution
The paper presents a novel distillation technique that drastically reduces sampling steps in diffusion models while maintaining quality, applicable across various tasks and architectures.
Findings
Achieves state-of-the-art FID and CLIP-Score for few-step image generation on COCO2014 and COCO2017
Requires only several GPU hours for training
Works across multiple tasks and backbone architectures
Abstract
In this paper, we propose Flash Diffusion, an efficient, fast, and versatile distillation method for accelerating generation with pre-trained diffusion models. The method reaches state-of-the-art performance in terms of FID and CLIP-Score for few-step image generation on the COCO2014 and COCO2017 datasets, while requiring only several GPU hours of training and fewer trainable parameters than existing methods. Beyond its efficiency, the method's versatility is demonstrated across several tasks, such as text-to-image, inpainting, face-swapping, and super-resolution, and across different backbones, such as UNet-based denoisers (SD1.5, SDXL) and DiT (Pixart-α), as well as adapters. In all cases, the method drastically reduces the number of sampling steps while maintaining very high-quality image generation. The official implementation is available at…
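For readers unfamiliar with FID, the metric cited above, the following is a minimal sketch of the Fréchet distance it is built on. It is not the paper's evaluation code. Standard FID fits full-covariance Gaussians to Inception features of real and generated images; this sketch assumes diagonal covariances so that it runs with the Python standard library alone, and the function name `diagonal_frechet_distance` and the toy feature vectors are hypothetical.

```python
import math
from statistics import fmean, variance


def diagonal_frechet_distance(real_feats, gen_feats):
    """Frechet distance between two Gaussians fitted per dimension.

    Assumption: covariances are diagonal, so the trace term
    Tr(S1 + S2 - 2 * (S1 @ S2) ** 0.5) reduces to a sum over dimensions of
    (sqrt(var1) - sqrt(var2)) ** 2. Real FID uses full covariance matrices.
    Each input is a list of equal-length feature vectors (at least two each).
    """
    dims = len(real_feats[0])
    total = 0.0
    for d in range(dims):
        real_col = [f[d] for f in real_feats]
        gen_col = [f[d] for f in gen_feats]
        # Squared difference of the per-dimension means.
        mean_term = (fmean(real_col) - fmean(gen_col)) ** 2
        # Squared difference of the per-dimension standard deviations
        # (sample variance, matching the usual unbiased covariance estimate).
        std_term = (math.sqrt(variance(real_col)) - math.sqrt(variance(gen_col))) ** 2
        total += mean_term + std_term
    return total


# Toy features standing in for Inception activations (hypothetical values).
real = [[0.0, 1.0], [2.0, 3.0]]
generated = [[1.0, 1.0], [3.0, 3.0]]
distance = diagonal_frechet_distance(real, generated)
```

Lower values mean the generated feature statistics are closer to the real ones; identical feature sets give a distance of zero.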
Taxonomy
Topics: Computer Graphics and Visualization Techniques · Medical Image Segmentation Techniques
Methods: Diffusion
