High-Resolution Image Synthesis with Latent Diffusion Models

Robin Rombach; Andreas Blattmann; Dominik Lorenz; Patrick Esser; Björn Ommer

arXiv:2112.10752 · cs.CV · April 14, 2022 · 711 cites

TL;DR

This paper introduces latent diffusion models that generate high-resolution images efficiently by operating in a compressed latent space, enabling flexible conditioning and achieving state-of-the-art results with reduced computational costs.

Contribution

The authors propose a novel latent space diffusion approach that improves visual fidelity, reduces training complexity, and enhances conditioning capabilities compared to previous pixel-based models.

Findings

01 Achieved state-of-the-art results in image inpainting.

02 Performed competitively in image synthesis and super-resolution.

03 Significantly reduced computational requirements compared to pixel-based diffusion models.

Abstract

By decomposing the image formation process into a sequential application of denoising autoencoders, diffusion models (DMs) achieve state-of-the-art synthesis results on image data and beyond. Additionally, their formulation allows for a guiding mechanism to control the image generation process without retraining. However, since these models typically operate directly in pixel space, optimization of powerful DMs often consumes hundreds of GPU days and inference is expensive due to sequential evaluations. To enable DM training on limited computational resources while retaining their quality and flexibility, we apply them in the latent space of powerful pretrained autoencoders. In contrast to previous work, training diffusion models on such a representation allows for the first time to reach a near-optimal point between complexity reduction and detail preservation, greatly boosting visual…
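The core idea in the abstract, running the diffusion process on a compressed latent rather than on pixels, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the downsampling factor f = 8, the 4 latent channels, the linear noise schedule and its constants, and the zero-predicting placeholder denoiser are all hypothetical choices made for the example, and a random vector stands in for the output of a real pretrained encoder.

```python
import math
import random


def latent_shape(height, width, f, channels):
    """Shape (C, H/f, W/f) of a latent from an encoder with downsampling factor f."""
    return (channels, height // f, width // f)


def linear_alpha_bar(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed constants)."""
    alpha_bar, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar


def add_noise(z0, alpha_bar_t, rng):
    """Forward diffusion in latent space: z_t = sqrt(a) * z0 + sqrt(1 - a) * eps."""
    eps = [rng.gauss(0.0, 1.0) for _ in z0]
    scale_signal = math.sqrt(alpha_bar_t)
    scale_noise = math.sqrt(1.0 - alpha_bar_t)
    zt = [scale_signal * z + scale_noise * e for z, e in zip(z0, eps)]
    return zt, eps


def mse(pred, target):
    """Mean squared error between predicted and true noise."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


if __name__ == "__main__":
    rng = random.Random(0)
    c, h, w = latent_shape(256, 256, f=8, channels=4)
    pixel_values = 3 * 256 * 256
    latent_values = c * h * w
    print("values per image:", pixel_values, "pixels vs", latent_values, "latents")

    # Stand-in for the encoder output: a flattened random latent (assumption).
    z0 = [rng.gauss(0.0, 1.0) for _ in range(latent_values)]
    alpha_bar = linear_alpha_bar(1000)
    t = rng.randrange(len(alpha_bar))
    zt, eps = add_noise(z0, alpha_bar[t], rng)

    # Placeholder denoiser that always predicts zero noise; a real model is a trained network.
    eps_pred = [0.0 for _ in zt]
    print("loss at step", t, "=", mse(eps_pred, eps))
```

With these illustrative settings the denoiser sees 4 × 32 × 32 = 4,096 values per image instead of 3 × 256 × 256 = 196,608, which is where the reduction in per-step cost comes from; a decoder would map the denoised latent back to pixel space after sampling.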

Code & Models

Videos

Stable Diffusion Is Getting Outrageously Good! (YouTube)

How does Stable Diffusion work? – Latent Diffusion Models EXPLAINED (YouTube)

Stable Diffusion: High-Resolution Image Synthesis with Latent Diffusion Models | ML Coding Series (YouTube)

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques · Advanced Neural Network Applications

Methods: Latent Diffusion Model · Diffusion · Inpainting