High-Resolution Image Synthesis with Latent Diffusion Models
Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, Björn Ommer

TL;DR
This paper introduces latent diffusion models that generate high-resolution images efficiently by operating in a compressed latent space, enabling flexible conditioning and achieving state-of-the-art results with reduced computational costs.
Contribution
The authors propose a novel latent space diffusion approach that improves visual fidelity, reduces training complexity, and enhances conditioning capabilities compared to previous pixel-based models.
Findings
Achieved state-of-the-art in image inpainting.
Performed competitively in image synthesis and super-resolution.
Significantly reduced computational requirements compared to pixel-based diffusion models.
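To give a rough sense of where the savings come from, the sketch below counts the values a diffusion model has to process in pixel space versus in a downsampled latent. The downsampling factor of 8 and the 4 latent channels are illustrative assumptions (they match a commonly used Stable Diffusion v1 configuration; the paper itself compares several factors), and the function names are hypothetical.

```python
def latent_shape(height, width, factor=8, latent_channels=4):
    """Return the (channels, height, width) of a latent downsampled by `factor`.

    `factor` and `latent_channels` are illustrative assumptions, not values
    taken from this summary.
    """
    if height % factor or width % factor:
        raise ValueError("image sides must be divisible by the downsampling factor")
    return (latent_channels, height // factor, width // factor)


def compression_ratio(height, width, image_channels=3, factor=8, latent_channels=4):
    """Ratio of pixel-space values to latent-space values for one image."""
    pixel_values = image_channels * height * width
    c, h, w = latent_shape(height, width, factor, latent_channels)
    return pixel_values / (c * h * w)


print(latent_shape(512, 512))       # (4, 64, 64)
print(compression_ratio(512, 512))  # 48.0
```

Under these assumptions, each denoising step handles 48 times fewer values than it would on the raw 512×512 RGB image, which is the intuition behind the reduced training and inference cost.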
Abstract
By decomposing the image formation process into a sequential application of denoising autoencoders, diffusion models (DMs) achieve state-of-the-art synthesis results on image data and beyond. Additionally, their formulation allows for a guiding mechanism to control the image generation process without retraining. However, since these models typically operate directly in pixel space, optimization of powerful DMs often consumes hundreds of GPU days and inference is expensive due to sequential evaluations. To enable DM training on limited computational resources while retaining their quality and flexibility, we apply them in the latent space of powerful pretrained autoencoders. In contrast to previous work, training diffusion models on such a representation allows for the first time to reach a near-optimal point between complexity reduction and detail preservation, greatly boosting visual…
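As a minimal sketch of the idea in the abstract, the snippet below applies the standard DDPM forward (noising) step to a latent vector rather than to pixels. The linear beta schedule, its endpoints, and all function names are assumptions for illustration; the pretrained encoder is left out, so `z0` is just a stand-in list of latent values, and the learned denoiser that inverts this process is not shown.

```python
import math
import random


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (an assumed, DDPM-style schedule)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def alpha_bars(betas):
    """Cumulative products of (1 - beta_t), one value per timestep."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def noise_latent(z0, t, abar, rng):
    """Sample z_t = sqrt(abar_t) * z0 + sqrt(1 - abar_t) * eps, with eps ~ N(0, I).

    Returns both z_t and eps, since eps is the usual regression target
    for the denoiser during training.
    """
    signal = math.sqrt(abar[t])
    sigma = math.sqrt(1.0 - abar[t])
    eps = [rng.gauss(0.0, 1.0) for _ in z0]
    z_t = [signal * z + sigma * e for z, e in zip(z0, eps)]
    return z_t, eps


# Stand-in latent: in a real pipeline this would come from the encoder.
rng = random.Random(0)
z0 = [0.5, -1.0, 2.0, 0.0]
abar = alpha_bars(linear_betas(1000))
z_t, eps = noise_latent(z0, t=500, abar=abar, rng=rng)
```

Because the noising is applied to the compact latent, the denoiser is trained and sampled on far fewer values per step; a decoder maps the final denoised latent back to an image.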
Code & Models
- 🤗 stabilityai/stable-diffusion-xl-base-1.0 · model · 2.0M dl · ♡ 7579
- 🤗 stable-diffusion-v1-5/stable-diffusion-v1-5 · model · 1.7M dl · ♡ 1066
- 🤗 CompVis/stable-diffusion-v1-4 · model · 468k dl · ♡ 6991
- 🤗 CompVis/stable-diffusion-v-1-4-original · model · ♡ 2843
- 🤗 stabilityai/stable-diffusion-xl-refiner-1.0 · model · 259k dl · ♡ 2030
- 🤗 stabilityai/sdxl-vae · model · 288k dl · ♡ 735
- 🤗 CompVis/ldm-celebahq-256 · model · 1.7k dl · ♡ 51
- 🤗 rinna/japanese-stable-diffusion · model · 15 dl · ♡ 178
- 🤗 jm12138/riffusion-model-v1 · model · ♡ 3
- 🤗 Intel/ldm3d · model · 28 dl · ♡ 64
Videos
Stable Diffusion Is Getting Outrageously Good! · youtube
How does Stable Diffusion work? – Latent Diffusion Models EXPLAINED · youtube
Stable Diffusion: High-Resolution Image Synthesis with Latent Diffusion Models | ML Coding Series · youtube
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques · Advanced Neural Network Applications
Methods: Latent Diffusion Model · Diffusion · Inpainting
