DisCoScene: Spatially Disentangled Generative Radiance Fields for Controllable 3D-aware Scene Synthesis

Yinghao Xu; Menglei Chai; Zifan Shi; Sida Peng; Ivan Skorokhodov; Aliaksandr Siarohin; Ceyuan Yang; Yujun Shen; Hsin-Ying Lee; Bolei Zhou; Sergey Tulyakov

arXiv:2212.11984·cs.CV·December 23, 2022·1 cites

Open Access

TL;DR

DisCoScene introduces a novel 3D-aware generative model that uses simple 3D bounding box layouts to synthesize complex, controllable scenes with high fidelity, enabling effective scene editing and composition.

Contribution

It proposes a scene synthesis method that leverages object-level 3D bounding boxes as scene priors, enhancing control and disentanglement in 3D scene generation.

Findings

01 Achieves state-of-the-art results on multiple scene datasets.

02 Demonstrates high-quality object and scene editing capabilities.

03 Efficiently composes objects and backgrounds into complete scenes.

Abstract

Existing 3D-aware image synthesis approaches mainly focus on generating a single canonical object and show limited capacity in composing a complex scene containing a variety of objects. This work presents DisCoScene: a 3D-aware generative model for high-quality and controllable scene synthesis. The key ingredient of our method is a very abstract object-level representation (i.e., 3D bounding boxes without semantic annotation) as the scene layout prior, which is simple to obtain, general to describe various scene contents, and yet informative to disentangle objects and background. Moreover, it serves as an intuitive user control for scene editing. Based on such a prior, the proposed model spatially disentangles the whole scene into object-centric generative radiance fields by learning on only 2D images with global-local discrimination. Our model obtains the generation fidelity and…
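
To make the idea of a bounding-box layout prior concrete, the sketch below shows one way a world-space sample point could be mapped into each box's normalized local frame, queried against a per-object field, and merged with a background field. It is an illustration under stated assumptions, not the authors' implementation: the `Box3D` class, the yaw-only rotation, the `[-1, 1]^3` normalization, and the composition rule (summed density with density-weighted color, a rule commonly used in compositional radiance fields) are all assumptions, since the abstract does not specify these details. The constant-valued fields stand in for learned generative radiance fields.

```python
import math
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

Vec3 = Tuple[float, float, float]
# A field maps a 3D point to (density, rgb). Hypothetical stand-in for a learned network.
Field = Callable[[Vec3], Tuple[float, Vec3]]


@dataclass
class Box3D:
    """Hypothetical layout entry: an oriented 3D box with no semantic label."""
    center: Vec3
    size: Vec3   # full extents along the box's local x, y, z axes
    yaw: float   # rotation about the world z axis, in radians (assumption)


def world_to_box(p: Vec3, box: Box3D) -> Vec3:
    """Map a world-space point into the box frame, where the box spans [-1, 1]^3."""
    dx, dy, dz = (p[i] - box.center[i] for i in range(3))
    c, s = math.cos(box.yaw), math.sin(box.yaw)
    # Inverse of a yaw rotation: multiply by the transposed rotation matrix.
    lx = c * dx + s * dy
    ly = -s * dx + c * dy
    return (2 * lx / box.size[0], 2 * ly / box.size[1], 2 * dz / box.size[2])


def inside_unit_box(q: Vec3) -> bool:
    """True if a normalized point lies within the box."""
    return all(abs(v) <= 1.0 for v in q)


def composite_point(p: Vec3, boxes: Sequence[Box3D],
                    object_fields: Sequence[Field],
                    background_field: Field) -> Tuple[float, Vec3]:
    """Combine background and object fields at one point (assumed rule)."""
    sigma_bg, rgb_bg = background_field(p)
    sigmas, rgbs = [sigma_bg], [rgb_bg]
    for box, field in zip(boxes, object_fields):
        q = world_to_box(p, box)
        if inside_unit_box(q):        # objects only contribute inside their box
            sigma, rgb = field(q)     # each object is queried in its own frame
            sigmas.append(sigma)
            rgbs.append(rgb)
    total = sum(sigmas)
    if total <= 0.0:
        return 0.0, (0.0, 0.0, 0.0)
    rgb = tuple(sum(s * c[i] for s, c in zip(sigmas, rgbs)) / total
                for i in range(3))
    return total, rgb


# Toy fields with constant outputs, purely for illustration.
red_object: Field = lambda q: (3.0, (1.0, 0.0, 0.0))
blue_background: Field = lambda p: (1.0, (0.0, 0.0, 1.0))

layout = [Box3D(center=(1.0, 0.0, 0.0), size=(2.0, 2.0, 2.0), yaw=0.0)]
sigma_in, rgb_in = composite_point((1.5, 0.0, 0.0), layout,
                                   [red_object], blue_background)
sigma_out, rgb_out = composite_point((5.0, 0.0, 0.0), layout,
                                     [red_object], blue_background)
```

Under these assumptions, editing a scene amounts to changing entries in `layout`: moving, rotating, or resizing a box changes where its object field is evaluated without touching the other objects or the background.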

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques · Advanced Vision and Imaging