Cascaded Diffusion Models for High Fidelity Image Generation
Jonathan Ho, Chitwan Saharia, William Chan, David J. Fleet, Mohammad Norouzi, Tim Salimans

TL;DR
This paper introduces cascaded diffusion models, which generate high-quality images by progressively increasing resolution. Conditioning augmentation is crucial for preventing compounding errors during sampling and for achieving state-of-the-art results on ImageNet.
Contribution
The paper presents a novel cascaded diffusion framework with conditioning augmentation, enabling high-fidelity image generation without auxiliary classifiers, outperforming prior models on ImageNet.
Findings
Achieved FID scores of 1.48, 3.52, and 4.88 at 64x64, 128x128, and 256x256 resolutions, respectively.
Outperformed BigGAN-deep in sample quality metrics.
Achieved classification accuracy scores of 63.02% (top-1) and 84.06% (top-5) at 256x256.
Abstract
We show that cascaded diffusion models are capable of generating high fidelity images on the class-conditional ImageNet generation benchmark, without any assistance from auxiliary image classifiers to boost sample quality. A cascaded diffusion model comprises a pipeline of multiple diffusion models that generate images of increasing resolution, beginning with a standard diffusion model at the lowest resolution, followed by one or more super-resolution diffusion models that successively upsample the image and add higher resolution details. We find that the sample quality of a cascading pipeline relies crucially on conditioning augmentation, our proposed method of data augmentation of the lower resolution conditioning inputs to the super-resolution models. Our experiments show that conditioning augmentation prevents compounding error during sampling in a cascaded model, helping us to…
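The pipeline structure described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the two "models" are random stand-ins rather than trained diffusion models, and every function name, the resolution list, and the noise level are hypothetical. Additive Gaussian noise is assumed here as one possible form of conditioning augmentation, and applying it both when building training pairs and between sampling stages is likewise an assumption of this sketch.

```python
import numpy as np


def augment_conditioning(low_res, noise_std, rng):
    """Assumed augmentation: corrupt the low-res conditioning input with Gaussian noise."""
    return low_res + noise_std * rng.standard_normal(low_res.shape)


def downsample_mean(image, factor):
    """Average-pool an (H, W, C) array by an integer factor."""
    h, w, c = image.shape
    return image.reshape(h // factor, factor, w // factor, factor, c).mean(axis=(1, 3))


def upsample_nearest(image, factor):
    """Nearest-neighbour upsampling of an (H, W, C) array by an integer factor."""
    return image.repeat(factor, axis=0).repeat(factor, axis=1)


def make_training_pair(high_res, factor, noise_std, rng):
    """Build (augmented low-res conditioning, high-res target) for a super-resolution stage."""
    low_res = downsample_mean(high_res, factor)
    return augment_conditioning(low_res, noise_std, rng), high_res


def base_model_sample(resolution, rng):
    """Stand-in for the lowest-resolution diffusion model (returns random pixels)."""
    return rng.uniform(-1.0, 1.0, size=(resolution, resolution, 3))


def super_resolution_sample(conditioning, factor, rng):
    """Stand-in for a super-resolution diffusion model: upsample, then add small detail."""
    upsampled = upsample_nearest(conditioning, factor)
    return upsampled + 0.01 * rng.standard_normal(upsampled.shape)


def cascade_sample(resolutions, noise_std, rng):
    """Run the cascade: base sample, then one super-resolution stage per resolution step."""
    image = base_model_sample(resolutions[0], rng)
    for prev, nxt in zip(resolutions, resolutions[1:]):
        conditioning = augment_conditioning(image, noise_std, rng)
        image = super_resolution_sample(conditioning, nxt // prev, rng)
    return image


rng = np.random.default_rng(0)
sample = cascade_sample([64, 128, 256], noise_std=0.1, rng=rng)
print(sample.shape)  # (256, 256, 3)

conditioning, target = make_training_pair(sample, factor=2, noise_std=0.1, rng=rng)
print(conditioning.shape, target.shape)  # (128, 128, 3) (256, 256, 3)
```

The point of the sketch is the data flow: each super-resolution stage only ever sees a perturbed version of its low-resolution input, so it is not trained to rely on conditioning that is cleaner than what the previous stage will actually produce at sampling time.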
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Image Processing Techniques · Domain Adaptation and Few-Shot Learning
Methods: Diffusion · PixelCNN · Softmax · Dense Connections · Adam · Feedforward Network · Convolution · Batch Normalization
