Diffusion Models Beat GANs on Image Synthesis
Prafulla Dhariwal, Alex Nichol

TL;DR
Diffusion models surpass GANs in image sample quality, reaching state-of-the-art FID on ImageNet through an improved architecture and classifier guidance. They match BigGAN-deep with as few as 25 forward passes per sample while covering the data distribution better.
Contribution
The paper introduces improved diffusion architectures and a classifier guidance method that significantly enhance image synthesis quality over GANs and previous diffusion models.
Findings
Achieved FID of 2.97 on ImageNet 128x128
Matched BigGAN-deep with as few as 25 forward passes per sample
Improved FID further to 3.94 on ImageNet 256×256 and 3.85 on ImageNet 512×512 by combining classifier guidance with upsampling diffusion models
Abstract
We show that diffusion models can achieve image sample quality superior to the current state-of-the-art generative models. We achieve this on unconditional image synthesis by finding a better architecture through a series of ablations. For conditional image synthesis, we further improve sample quality with classifier guidance: a simple, compute-efficient method for trading off diversity for fidelity using gradients from a classifier. We achieve an FID of 2.97 on ImageNet 128×128, 4.59 on ImageNet 256×256, and 7.72 on ImageNet 512×512, and we match BigGAN-deep even with as few as 25 forward passes per sample, all while maintaining better coverage of the distribution. Finally, we find that classifier guidance combines well with upsampling diffusion models, further improving FID to 3.94 on ImageNet 256×256 and 3.85 on ImageNet 512×512. We release our code…
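To make the classifier guidance idea in the abstract concrete, here is a minimal sketch of one guided reverse step. It assumes the reverse step is a Gaussian with mean mu and diagonal variance sigma2, and it shifts that mean by the scaled classifier gradient, mu + s * sigma2 * grad log p(y | x). The toy linear-softmax classifier (W, b), the function names, and the guidance scale value are illustrative assumptions, not the released code; in practice the mean and variance come from the diffusion network and the gradient comes from a noise-aware classifier via automatic differentiation.

```python
import numpy as np

def log_softmax(z):
    """Numerically stable log-softmax over a 1-D array of logits."""
    z = z - z.max()
    return z - np.log(np.exp(z).sum())

def classifier_log_prob_grad(x, W, b, y):
    """Gradient of log p(y | x) w.r.t. x for a toy linear-softmax classifier.

    Hypothetical stand-in for a trained classifier: logits = W @ x + b.
    The analytic gradient is W[y] - sum_k p_k * W[k].
    """
    p = np.exp(log_softmax(W @ x + b))
    return W[y] - p @ W

def guided_mean(mu, sigma2, grad, scale):
    """Shift the reverse-step mean toward class y (diagonal covariance assumed)."""
    return mu + scale * sigma2 * grad

def guided_reverse_step(mu, sigma2, grad, scale, rng):
    """Draw one sample from N(guided mean, diag(sigma2))."""
    noise = rng.standard_normal(mu.shape)
    return guided_mean(mu, sigma2, grad, scale) + np.sqrt(sigma2) * noise

# Usage with made-up numbers: 2-D "image", 2 classes, target class 0.
W = np.eye(2)
b = np.zeros(2)
mu = np.zeros(2)               # would come from the diffusion model
sigma2 = np.full(2, 0.1)       # would come from the diffusion model
grad = classifier_log_prob_grad(mu, W, b, y=0)
x_prev = guided_reverse_step(mu, sigma2, grad, scale=2.0,
                             rng=np.random.default_rng(0))
print(x_prev)
```

A larger scale pushes samples further toward regions the classifier assigns to the target class, which is the diversity-for-fidelity trade-off the abstract describes; a scale of 0 recovers the unguided step.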
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Model Reduction and Neural Networks · Cell Image Analysis Techniques
Methods: classifier-guidance · Diffusion · Adam · Convolution · Residual Connection · Batch Normalization
