SSMG: Spatial-Semantic Map Guided Diffusion Model for Free-form Layout-to-Image Generation
Chengyou Jia, Minnan Luo, Zhuohang Dang, Guang Dai, Xiaojun Chang, Mengmeng Wang, Jingdong Wang

TL;DR
This paper introduces SSMG, a novel diffusion model guided by spatial-semantic feature maps for improved free-form layout-to-image generation, enhancing control and quality over previous methods.
Contribution
The paper proposes a new spatial-semantic map guided diffusion model with relation-sensitive and location-sensitive attention mechanisms for superior scene image generation.
Findings
Achieves state-of-the-art results in fidelity, diversity, and controllability.
Outperforms previous methods in spatial and semantic control.
Demonstrates high-quality, realistic scene image generation from layouts.
Abstract
Despite significant progress in Text-to-Image (T2I) generative models, even lengthy and complex text descriptions still struggle to convey detailed controls. In contrast, Layout-to-Image (L2I) generation, aiming to generate realistic and complex scene images from user-specified layouts, has risen to prominence. However, existing methods transform layout information into tokens or RGB images for conditional control in the generative process, leading to insufficient spatial and semantic controllability of individual instances. To address these limitations, we propose a novel Spatial-Semantic Map Guided (SSMG) diffusion model that adopts the feature map, derived from the layout, as guidance. Owing to rich spatial and semantic information encapsulated in well-designed feature maps, SSMG achieves superior generation quality with sufficient spatial and semantic controllability compared to…
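The abstract's central idea, using a feature map derived from the layout as guidance instead of tokens or an RGB image, can be illustrated with a small sketch. This is not the authors' implementation: the function name `build_layout_feature_map`, the normalized box format, the toy two-dimensional embeddings, and the choice to average overlapping regions are all assumptions made for illustration. In practice the per-instance vectors would presumably come from a text encoder.

```python
import numpy as np

def build_layout_feature_map(boxes, embeddings, height, width):
    """Rasterize boxes into a (height, width, dim) feature map.

    boxes: list of (x0, y0, x1, y1) in normalized [0, 1] coordinates.
    embeddings: array of shape (len(boxes), dim), one vector per instance.
    Overlapping regions are averaged (an assumption of this sketch).
    """
    embeddings = np.asarray(embeddings, dtype=np.float32)
    dim = embeddings.shape[1]
    fmap = np.zeros((height, width, dim), dtype=np.float32)
    count = np.zeros((height, width, 1), dtype=np.float32)
    for (x0, y0, x1, y1), emb in zip(boxes, embeddings):
        # Convert normalized box corners to integer grid indices.
        c0, c1 = int(round(x0 * width)), int(round(x1 * width))
        r0, r1 = int(round(y0 * height)), int(round(y1 * height))
        # Write the instance's semantic vector into its spatial region.
        fmap[r0:r1, c0:c1] += emb
        count[r0:r1, c0:c1] += 1.0
    # Cells covered by no box stay zero; overlaps are averaged.
    return fmap / np.maximum(count, 1.0)

# Two hypothetical instances with toy 2-d embeddings.
boxes = [(0.0, 0.0, 0.5, 0.5), (0.25, 0.25, 1.0, 1.0)]
embeddings = np.array([[1.0, 0.0], [0.0, 1.0]])
fmap = build_layout_feature_map(boxes, embeddings, height=4, width=4)
```

Each cell of the resulting map carries both where an instance sits (its position in the grid) and what it is (its embedding), which is the kind of joint spatial and semantic information the abstract contrasts with token or RGB conditioning.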
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Image and Video Retrieval Techniques · Computer Graphics and Visualization Techniques
Methods: Diffusion
