Toward Scene Graph and Layout Guided Complex 3D Scene Generation
Yu-Hsiang Huang, Wei Wang, Sheng-Yu Huang, Yu-Chiang Frank Wang

TL;DR
GraLa3D is a novel framework that uses scene graphs and layout guidance to generate complex 3D scenes from text, improving control over object interactions and layout fidelity.
Contribution
GraLa3D introduces a scene graph-based approach in which super-nodes model object interactions, addressing limitations of methods based on score distillation sampling (SDS).
Findings
Successfully generates complex 3D scenes aligned with text prompts
Models object interactions within super-nodes effectively
Overcomes appearance leakage issues in multi-object scenes
Abstract
Recent advancements in object-centric text-to-3D generation have shown impressive results. However, generating complex 3D scenes remains an open challenge due to the intricate relations between objects. Moreover, existing methods are largely based on score distillation sampling (SDS), which constrains the ability to manipulate multiple objects with specific interactions. Addressing these critical yet underexplored issues, we present a novel framework of Scene Graph and Layout Guided 3D Scene Generation (GraLa3D). Given a text prompt describing a complex 3D scene, GraLa3D utilizes an LLM to model the scene using a scene graph representation with layout bounding box information. GraLa3D uniquely constructs the scene graph with single-object nodes and composite super-nodes. In addition to constraining 3D generation within the desirable layout, a major contribution lies in the modeling of…
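To make the scene graph idea in the abstract concrete, here is a minimal sketch of one way such a structure could be represented. It is not the authors' implementation. Every name in it (`BoundingBox`, `ObjectNode`, `SuperNode`, `SceneGraph`, `group`) is hypothetical, and the rule that a super-node's layout box is the union of its members' boxes is an assumption made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class BoundingBox:
    """Axis-aligned layout box given by its min and max corners."""
    min_corner: Vec3
    max_corner: Vec3

    def union(self, other: "BoundingBox") -> "BoundingBox":
        # Smallest axis-aligned box that contains both boxes.
        lo = tuple(min(a, b) for a, b in zip(self.min_corner, other.min_corner))
        hi = tuple(max(a, b) for a, b in zip(self.max_corner, other.max_corner))
        return BoundingBox(lo, hi)


@dataclass
class ObjectNode:
    """Single-object node: one object, its text description, its layout box."""
    name: str
    prompt: str
    box: BoundingBox


@dataclass
class SuperNode:
    """Composite node grouping objects that interact with each other."""
    name: str
    prompt: str  # text describing the interaction between the members
    members: List[ObjectNode]

    @property
    def box(self) -> BoundingBox:
        # Assumption: the super-node box is the union of its member boxes.
        merged = self.members[0].box
        for member in self.members[1:]:
            merged = merged.union(member.box)
        return merged


@dataclass
class SceneGraph:
    objects: Dict[str, ObjectNode] = field(default_factory=dict)
    super_nodes: List[SuperNode] = field(default_factory=list)

    def add_object(self, node: ObjectNode) -> None:
        self.objects[node.name] = node

    def group(self, name: str, prompt: str, member_names: List[str]) -> SuperNode:
        # Collect existing object nodes into a super-node.
        node = SuperNode(name, prompt, [self.objects[n] for n in member_names])
        self.super_nodes.append(node)
        return node


# Usage with made-up objects and coordinates.
graph = SceneGraph()
graph.add_object(ObjectNode("cat", "a cat", BoundingBox((0.0, 0.0, 0.0), (1.0, 1.0, 1.0))))
graph.add_object(ObjectNode("sofa", "a sofa", BoundingBox((0.5, -1.0, 0.0), (3.0, 0.5, 1.5))))
graph.add_object(ObjectNode("lamp", "a floor lamp", BoundingBox((4.0, 0.0, 0.0), (4.5, 0.5, 2.0))))
cat_on_sofa = graph.group("cat_on_sofa", "a cat sleeping on a sofa", ["cat", "sofa"])
```

In this sketch the lamp stays a single-object node, while the cat and sofa are also reachable through a super-node that carries the interaction text and a combined layout box.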
Taxonomy
Topics: 3D Shape Modeling and Analysis · Computer Graphics and Visualization Techniques · Human Motion and Animation
