TL;DR
This paper introduces the Latent Denoising Diffusion GAN, which runs the diffusion process in a compact latent space produced by pre-trained autoencoders and adds a Weighted Learning strategy. The authors report faster inference and better image quality than previous diffusion models such as DiffusionGAN and Wavelet Diffusion.
Contribution
The paper combines a latent-space formulation, in which pre-trained autoencoders compress images before diffusion, with a Weighted Learning strategy to improve the efficiency and output quality of diffusion models, surpassing prior methods such as DiffusionGAN and Wavelet Diffusion.
Findings
Achieves state-of-the-art speed among diffusion models.
Shows significant improvements in image quality metrics.
Demonstrates effectiveness across multiple datasets.
Abstract
Diffusion models are emerging as powerful solutions for generating high-fidelity and diverse images, often surpassing GANs under many circumstances. However, their slow inference speed hinders their potential for real-time applications. To address this, DiffusionGAN leveraged a conditional GAN to drastically reduce the denoising steps and speed up inference. Its advancement, Wavelet Diffusion, further accelerated the process by converting data into wavelet space, thus enhancing efficiency. Nonetheless, these models still fall short of GANs in terms of speed and image quality. To bridge these gaps, this paper introduces the Latent Denoising Diffusion GAN, which employs pre-trained autoencoders to compress images into a compact latent space, significantly improving inference speed and image quality. Furthermore, we propose a Weighted Learning strategy to enhance diversity and image…
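The few-step latent sampling idea described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the step count `T`, the noise schedule `BETAS`, the latent shape 8×8×4, and the `toy_generator` placeholder are all hypothetical and stand in for the trained conditional GAN generator and the pre-trained autoencoder. Only the Gaussian posterior used to step from z_t to z_{t-1} is the standard diffusion formula.

```python
import math
import random

# Assumed toy settings: 4 denoising steps and a hand-picked noise schedule.
T = 4
BETAS = [0.1, 0.3, 0.5, 0.7]
ALPHAS = [1.0 - b for b in BETAS]
ALPHA_BARS = []
_prod = 1.0
for _a in ALPHAS:
    _prod *= _a
    ALPHA_BARS.append(_prod)


def toy_generator(z_t, t):
    """Hypothetical placeholder for the trained generator that predicts
    the clean latent z_0 from the noisy latent z_t. It only rescales."""
    return [math.sqrt(ALPHA_BARS[t]) * v for v in z_t]


def posterior_sample(z0_pred, z_t, t, rng):
    """Draw z_{t-1} from the Gaussian posterior q(z_{t-1} | z_t, z_0)."""
    if t == 0:
        return list(z0_pred)  # last step: return the predicted clean latent
    abar_t, abar_prev = ALPHA_BARS[t], ALPHA_BARS[t - 1]
    c0 = BETAS[t] * math.sqrt(abar_prev) / (1.0 - abar_t)
    ct = (1.0 - abar_prev) * math.sqrt(ALPHAS[t]) / (1.0 - abar_t)
    std = math.sqrt(BETAS[t] * (1.0 - abar_prev) / (1.0 - abar_t))
    return [c0 * a + ct * b + std * rng.gauss(0.0, 1.0)
            for a, b in zip(z0_pred, z_t)]


def sample_latent(latent_size, rng):
    """Start from Gaussian noise and run T denoising steps in latent space."""
    z = [rng.gauss(0.0, 1.0) for _ in range(latent_size)]
    steps = 0
    for t in reversed(range(T)):
        z = posterior_sample(toy_generator(z, t), z, t, rng)
        steps += 1
    return z, steps


rng = random.Random(0)
image_size = 32 * 32 * 3   # values per image in pixel space
latent_size = 8 * 8 * 4    # assumed latent shape, for illustration only
z, steps = sample_latent(latent_size, rng)
ratio = image_size // latent_size  # values processed per step shrink by this factor
```

In a real system, a pre-trained decoder would map `z` back to pixel space after the loop. In this sketch each denoising step touches `latent_size` values instead of `image_size`, which is the intuition behind the efficiency argument for working in latent space.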
Taxonomy
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Diffusion
