Tackling the Generative Learning Trilemma with Denoising Diffusion GANs
Zhisheng Xiao, Karsten Kreis, Arash Vahdat

TL;DR
This paper introduces denoising diffusion GANs, a novel approach that significantly accelerates diffusion models by modeling denoising steps with multimodal GANs, achieving high sample quality and diversity at a fraction of the original sampling time.
Contribution
The paper proposes denoising diffusion GANs that reduce diffusion sampling time by modeling denoising steps with multimodal GANs, enabling practical real-world application.
Findings
Samples roughly 2000x faster than the original diffusion models on CIFAR-10.
Keeps sample quality and diversity competitive with the original diffusion models.
Shows better mode coverage and sample diversity than traditional GANs.
Abstract
A wide variety of deep generative models has been developed in the past decade. Yet, these models often struggle with simultaneously addressing three key requirements including: high sample quality, mode coverage, and fast sampling. We call the challenge imposed by these requirements the generative learning trilemma, as the existing models often trade some of them for others. Particularly, denoising diffusion models have shown impressive sample quality and diversity, but their expensive sampling does not yet allow them to be applied in many real-world applications. In this paper, we argue that slow sampling in these models is fundamentally attributed to the Gaussian assumption in the denoising step which is justified only for small step sizes. To enable denoising with large steps, and hence, to reduce the total number of denoising steps, we propose to model the denoising distribution…
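The abstract's claim that the Gaussian assumption holds only for small step sizes can be checked on a toy problem. The sketch below is an illustration and not the authors' code. It assumes 1D data drawn from an equal-weight mixture of two Gaussians at -1 and +1, and a single forward step x_t = sqrt(1 - beta) * x_prev + sqrt(beta) * noise. The function names, the grid, and the data width `sigma_data` are all hypothetical choices. It evaluates the true denoising posterior over x_prev on a grid and counts its peaks.

```python
import numpy as np


def log_denoising_posterior(x_prev, x_t, beta, sigma_data=0.2):
    """Unnormalized log q(x_prev | x_t) for one forward diffusion step.

    Assumed toy setup (not from the paper): the data prior is an equal-weight
    mixture of N(-1, sigma_data^2) and N(+1, sigma_data^2).
    """
    # Log prior of the two-component mixture, up to an additive constant.
    log_prior = np.logaddexp(
        -((x_prev + 1.0) ** 2) / (2.0 * sigma_data**2),
        -((x_prev - 1.0) ** 2) / (2.0 * sigma_data**2),
    )
    # Log likelihood of x_t under q(x_t | x_prev) = N(sqrt(1 - beta) * x_prev, beta).
    log_lik = -((x_t - np.sqrt(1.0 - beta) * x_prev) ** 2) / (2.0 * beta)
    return log_prior + log_lik


def count_modes(beta, x_t=0.0):
    """Count strict local maxima of the posterior on a fixed grid."""
    grid = np.linspace(-2.0, 2.0, 4001)
    logp = log_denoising_posterior(grid, x_t, beta)
    interior = logp[1:-1]
    is_peak = (interior > logp[:-2]) & (interior > logp[2:])
    return int(is_peak.sum())


if __name__ == "__main__":
    for beta in (0.001, 0.5):
        print(f"beta={beta}: {count_modes(beta)} mode(s)")
```

In this toy setting, a small step (beta = 0.001) gives a single-peaked posterior that a Gaussian can approximate well, while a large step (beta = 0.5) gives two peaks that no single Gaussian can capture. That is the gap a multimodal denoising model is meant to close.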
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Music and Audio Processing
Methods: Diffusion
