Semantic Image Synthesis via Diffusion Models
Wengang Zhou, Weilun Wang, Jianmin Bao, Dongdong Chen, Dong Chen, Lu Yuan, Houqiang Li

TL;DR
This paper introduces a diffusion-based framework for semantic image synthesis that processes the semantic layout and the noisy image separately, achieving state-of-the-art image quality and diversity.
Contribution
The proposed framework feeds the noisy image to the U-Net encoder and injects the semantic layout into the decoder through multi-layer spatially-adaptive normalization. It also employs classifier-free guidance to improve synthesis quality.
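As a rough illustration of classifier-free guidance, the sketch below combines a conditional and an unconditional noise prediction using a common formulation of the technique. The function name `guided_noise`, the `scale` parameter, and the toy arrays are assumptions made for illustration; the paper's exact parameterization of the guidance scale may differ.

```python
import numpy as np


def guided_noise(eps_cond, eps_uncond, scale):
    """Blend two noise predictions with classifier-free guidance.

    eps_cond:   noise predicted with the semantic layout as condition
    eps_uncond: noise predicted with the condition dropped
    scale:      guidance strength (hypothetical name); 1.0 returns eps_cond,
                0.0 returns eps_uncond, larger values extrapolate past eps_cond
    """
    eps_cond = np.asarray(eps_cond, dtype=float)
    eps_uncond = np.asarray(eps_uncond, dtype=float)
    # Move from the unconditional prediction toward the conditional one.
    return eps_uncond + scale * (eps_cond - eps_uncond)


# Toy usage with made-up predictions for a two-element "image".
example = guided_noise([1.0, 2.0], [0.0, 1.0], scale=2.0)
```

In a real sampler, both predictions would typically come from the same network, evaluated once with the layout and once with the layout replaced by a null input.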
Findings
Achieves state-of-the-art FID scores on benchmark datasets.
Demonstrates higher diversity in generated images compared to previous methods.
Shows effective semantic interpretability in the synthesized images.
Abstract
Denoising Diffusion Probabilistic Models (DDPMs) have achieved remarkable success in various image generation tasks compared with Generative Adversarial Nets (GANs). Recent work on semantic image synthesis mainly follows the de facto GAN-based approaches, which may lead to unsatisfactory quality or diversity of generated images. In this paper, we propose a novel framework based on DDPM for semantic image synthesis. Unlike previous conditional diffusion models, which directly feed the semantic layout and noisy image as input to a U-Net structure and may not fully leverage the information in the input semantic mask, our framework processes the semantic layout and the noisy image differently. It feeds the noisy image to the encoder of the U-Net structure and the semantic layout to the decoder through multi-layer spatially-adaptive normalization operators. To further improve the generation quality and…
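To make the idea of spatially-adaptive normalization concrete, here is a minimal NumPy sketch of one such operator: decoder features are normalized, then scaled and shifted per pixel by values derived from the semantic layout. This is a simplification under stated assumptions, not the authors' implementation. The function name `spatially_adaptive_norm`, the per-class weight matrices `w_gamma` and `w_beta` (standing in for learned convolutions), and the plain per-channel normalization are all hypothetical choices.

```python
import numpy as np


def spatially_adaptive_norm(features, layout_onehot, w_gamma, w_beta, eps=1e-5):
    """Modulate normalized features with layout-dependent scale and shift.

    features:      (C, H, W) decoder activations
    layout_onehot: (K, H, W) one-hot semantic layout with K classes
    w_gamma:       (C, K) hypothetical per-class scale weights
    w_beta:        (C, K) hypothetical per-class shift weights
    """
    # Parameter-free normalization of each channel over its spatial positions.
    mean = features.mean(axis=(1, 2), keepdims=True)
    std = features.std(axis=(1, 2), keepdims=True)
    normalized = (features - mean) / (std + eps)

    # Per-pixel scale and shift: each pixel picks the weights of its class
    # (equivalent to a 1x1 convolution over the one-hot layout).
    gamma = np.einsum("ck,khw->chw", w_gamma, layout_onehot)
    beta = np.einsum("ck,khw->chw", w_beta, layout_onehot)
    return gamma * normalized + beta
```

Because the scale and shift vary with the class at each pixel, the layout information is re-injected at every decoder layer that applies the operator, rather than only once at the network input.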
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Vision and Imaging · Computer Graphics and Visualization Techniques
Methods: Diffusion · Max Pooling · Convolution · Concatenated Skip Connection · U-Net
