SceneCraft: Layout-Guided 3D Scene Generation
Xiuyu Yang, Yunze Man, Jun-Kun Chen, Yu-Xiong Wang

TL;DR
SceneCraft is a novel approach that generates detailed, multi-room indoor 3D scenes from textual descriptions and layout preferences, using a rendering-based diffusion model and neural radiance fields to produce realistic, complex environments.
Contribution
SceneCraft introduces a method that combines semantic layout conversion, diffusion models, and NeRFs to generate large-scale indoor scenes, extending generation beyond the small-scale scenes that earlier methods were limited to.
Findings
Outperforms existing methods in complex indoor scene generation
Supports multi-room environments with irregular shapes and layouts
Produces high-quality, realistic visual results
Abstract
The creation of complex 3D scenes tailored to user specifications has been a tedious and challenging task with traditional 3D modeling tools. Although some pioneering methods have achieved automatic text-to-3D generation, they are generally limited to small-scale scenes with restricted control over the shape and texture. We introduce SceneCraft, a novel method for generating detailed indoor scenes that adhere to textual descriptions and spatial layout preferences provided by users. Central to our method is a rendering-based technique, which converts 3D semantic layouts into multi-view 2D proxy maps. Furthermore, we design a semantic and depth conditioned diffusion model to generate multi-view images, which are used to learn a neural radiance field (NeRF) as the final scene representation. Without the constraints of panorama image generation, we surpass previous methods in supporting…
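The layout-to-proxy-map step described in the abstract can be illustrated with a small ray caster. This is a minimal sketch, not the authors' implementation: it assumes the layout is given as labeled axis-aligned boxes and a pinhole camera, and the function name `render_proxy_maps`, its parameters, and the box format are all hypothetical. It produces one semantic map and one depth map per camera pose, the kind of 2D conditioning signals a semantic and depth conditioned diffusion model would consume.

```python
import numpy as np

def render_proxy_maps(boxes, labels, cam_pos, cam_rot, fov_deg=90.0, size=(64, 64)):
    """Ray-cast labeled axis-aligned boxes into a semantic map and a depth map.

    boxes:   (N, 2, 3) array of [min_corner, max_corner] (hypothetical layout format)
    labels:  (N,) integer class ids; 0 is reserved for background
    cam_rot: (3, 3) camera-to-world rotation; the camera looks along +z
    """
    h, w = size
    focal = 0.5 * w / np.tan(np.radians(fov_deg) / 2)
    # Pixel-center offsets from the principal point.
    u, v = np.meshgrid(np.arange(w) + 0.5 - w / 2, np.arange(h) + 0.5 - h / 2)
    dirs = np.stack([u, v, np.full_like(u, focal)], axis=-1)
    dirs /= np.linalg.norm(dirs, axis=-1, keepdims=True)
    dirs = dirs @ cam_rot.T  # rotate ray directions into world space

    semantic = np.zeros((h, w), dtype=np.int32)
    depth = np.full((h, w), np.inf)  # distance along the ray, inf where nothing is hit

    with np.errstate(divide="ignore", invalid="ignore"):
        inv = 1.0 / dirs
        for (lo, hi), label in zip(boxes, labels):
            # Slab method: intersect each ray with the three pairs of box planes.
            t0 = (lo - cam_pos) * inv
            t1 = (hi - cam_pos) * inv
            t_near = np.minimum(t0, t1).max(axis=-1)
            t_far = np.maximum(t0, t1).min(axis=-1)
            # Keep hits in front of the camera that are closer than earlier boxes.
            # Boxes that contain the camera (t_near <= 0) are skipped in this sketch.
            hit = (t_far >= t_near) & (t_near > 0) & (t_near < depth)
            semantic[hit] = label
            depth[hit] = t_near[hit]
    return semantic, depth

# Usage: a single box with class id 7, two units in front of a camera at the origin.
boxes = np.array([[[-0.5, -0.5, 2.0], [0.5, 0.5, 3.0]]])
labels = np.array([7])
sem, dep = render_proxy_maps(boxes, labels, np.zeros(3), np.eye(3))
```

Calling the function once per camera pose along a trajectory would yield the multi-view set of proxy maps; the rest of the pipeline (the conditioned diffusion model and NeRF fitting) is outside the scope of this sketch.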
Taxonomy
Topics: 3D Modeling in Geospatial Applications · Computer Graphics and Visualization Techniques · 3D Surveying and Cultural Heritage
Methods: Diffusion
