Draw Like an Artist: Complex Scene Generation with Diffusion Model via Composition, Painting, and Retouching
Minghao Liu, Le Zhang, Yingjie Tian, Xiaochao Qu, Luoqi Liu, Ting Liu

TL;DR
This paper introduces a training-free diffusion framework called Complex Diffusion (CxD) that mimics the artist's process of creating complex scenes through composition, painting, and retouching, leveraging language models for prompt decomposition and scene management.
Contribution
The paper presents a novel three-stage diffusion-based method for complex scene generation, guided by a new definition and decomposition criteria, without requiring additional training.
Findings
Outperforms previous SOTA methods in complex scene generation
Produces high-quality, semantically consistent, and diverse images
Effectively manages intricate prompts for detailed scene creation
Abstract
Recent advances in text-to-image diffusion models have demonstrated impressive capabilities in image quality. However, complex scene generation remains relatively unexplored, and even the definition of "complex scene" itself remains unclear. In this paper, we address this gap by providing a precise definition of complex scenes and introducing a set of Complex Decomposition Criteria (CDC) based on this definition. Inspired by the artist's painting process, we propose a training-free diffusion framework called Complex Diffusion (CxD), which divides the process into three stages: composition, painting, and retouching. Our method leverages the powerful chain-of-thought capabilities of large language models (LLMs) to decompose complex prompts based on CDC and to manage composition and layout. We then develop an attention modulation method that guides simple prompts to specific regions to…
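The abstract is cut off before it describes the attention modulation in detail, so the following is only a generic illustration of the idea of steering prompt tokens toward assigned image regions, not the paper's actual method. It is a minimal NumPy sketch under stated assumptions: the function name `region_masked_attention`, the `penalty` parameter, and the boolean region layout are all hypothetical and do not come from CxD.

```python
import numpy as np

def region_masked_attention(logits, token_regions, penalty=1e9):
    """Softmax over tokens, restricted per pixel to tokens whose region covers it.

    logits:        (num_pixels, num_tokens) raw cross-attention scores.
    token_regions: (num_pixels, num_tokens) booleans; True where a token
                   is allowed to influence a pixel (hypothetical layout).
    """
    logits = np.asarray(logits, dtype=np.float64)
    allowed = np.asarray(token_regions, dtype=bool)

    # Pixels covered by no region fall back to ordinary, unmasked attention.
    has_allowed = allowed.any(axis=1, keepdims=True)
    allowed = np.where(has_allowed, allowed, True)

    # Push disallowed tokens far down before normalising.
    masked = np.where(allowed, logits, logits - penalty)
    masked = masked - masked.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(masked)
    return weights / weights.sum(axis=1, keepdims=True)

# A 2x2 image flattened to 4 pixels, with two sub-prompt tokens:
# token 0 owns the left column (pixels 0 and 2), token 1 the right column.
regions = np.array([[True, False],
                    [False, True],
                    [True, False],
                    [False, True]])
weights = region_masked_attention(np.zeros((4, 2)), regions)
```

In this toy layout each pixel ends up attending almost entirely to the token that owns its region, which is the behaviour a region-guided scheme aims for. A practical system would apply such a bias inside the diffusion model's cross-attention layers and would likely use soft rather than hard masks.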
Taxonomy
Topics: Computer Graphics and Visualization Techniques · 3D Shape Modeling and Analysis
Methods: Softmax · Attention Is All You Need · Sparse Evolutionary Training · Diffusion
