SceneComposer: Any-Level Semantic Image Synthesis
Yu Zeng, Zhe Lin, Jianming Zhang, Qing Liu, John Collomosse, Jason Kuen, Vishal M. Patel

TL;DR
SceneComposer introduces a flexible framework for semantic image synthesis that seamlessly integrates text and shape information at various precision levels, enabling diverse user control and workflow integration.
Contribution
It presents a multi-level semantic image synthesis framework that accepts layouts of variable precision, together with a pipeline for collecting training data, a precision-encoded mask pyramid for encoding the input, and a multi-scale diffusion model.
Findings
High-quality image generation from diverse semantic layouts
Effective handling of different precision levels in input layouts
Favorable comparison with existing methods
Abstract
We propose a new framework for conditional image synthesis from semantic layouts of any precision levels, ranging from pure text to a 2D semantic canvas with precise shapes. More specifically, the input layout consists of one or more semantic regions with free-form text descriptions and adjustable precision levels, which can be set based on the desired controllability. The framework naturally reduces to text-to-image (T2I) at the lowest level with no shape information, and it becomes segmentation-to-image (S2I) at the highest level. By supporting the levels in-between, our framework is flexible in assisting users of different drawing expertise and at different stages of their creative workflow. We introduce several novel techniques to address the challenges coming with this new setup, including a pipeline for collecting training data; a precision-encoded mask pyramid and a text feature…
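The input layout described in the abstract can be pictured as a small data structure. The following is a minimal sketch only, not the authors' implementation: the names `Region`, `synthesis_mode`, `NO_SHAPE`, and `EXACT_SHAPE`, the integer encoding of precision levels, and the number of levels are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical level encoding (an assumption, not taken from the paper):
NO_SHAPE = 0      # lowest level: text only, no shape information
EXACT_SHAPE = 3   # highest level: precise shape; the real number of levels may differ


@dataclass
class Region:
    """One semantic region: a free-form description plus a precision level."""
    text: str
    precision: int
    mask: Optional[List[List[int]]] = None  # binary 2D mask; None when no shape is given

    def __post_init__(self) -> None:
        if not NO_SHAPE <= self.precision <= EXACT_SHAPE:
            raise ValueError(f"precision must be in [{NO_SHAPE}, {EXACT_SHAPE}]")
        if self.precision > NO_SHAPE and self.mask is None:
            raise ValueError("a region above the lowest level needs a mask")


def synthesis_mode(regions: List[Region]) -> str:
    """Name the special case a layout reduces to, following the abstract."""
    if not regions:
        raise ValueError("the layout needs at least one region")
    levels = {r.precision for r in regions}
    if levels == {NO_SHAPE}:
        return "T2I"           # pure text, no shape information
    if levels == {EXACT_SHAPE}:
        return "S2I"           # every region has a precise shape
    return "intermediate"      # any of the levels in between, or a mix


# Usage: a text-only layout, and a layout mixing coarse and precise regions.
text_only = [Region("a red barn in a snowy field", NO_SHAPE)]
mixed = [
    Region("snowy field", 1, mask=[[1, 1], [1, 1]]),
    Region("red barn", EXACT_SHAPE, mask=[[0, 1], [0, 1]]),
]
print(synthesis_mode(text_only))  # T2I
print(synthesis_mode(mixed))      # intermediate
```

Storing the precision level per region rather than per layout mirrors the abstract's statement that each region carries its own adjustable level.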
Taxonomy
Topics: Computer Graphics and Visualization Techniques · Generative Adversarial Networks and Image Synthesis · Image Processing and 3D Reconstruction
Methods: Test · Diffusion
