SceneFactor: Factored Latent 3D Diffusion for Controllable 3D Scene Generation

Alexey Bokhovkin; Quan Meng; Shubham Tulsiani; Angela Dai

arXiv:2412.01801·cs.CV·December 4, 2024

Open Access

TL;DR

SceneFactor introduces a diffusion-based method for large-scale, controllable 3D scene generation and editing by leveraging factored semantic and geometric manifolds, enabling intuitive manipulation of generated scenes.

Contribution

The paper proposes a novel factored diffusion formulation that allows for controllable and editable 3D scene synthesis using semantic 3D boxes as proxies.

Findings

01 Enables text-guided 3D scene synthesis with controllability.

02 Allows intuitive, localized editing of 3D scenes via semantic proxies.

03 Demonstrates high-fidelity 3D scene generation with effective editing capabilities.

Abstract

We present SceneFactor, a diffusion-based approach for large-scale 3D scene generation that enables controllable generation and effortless editing. SceneFactor enables text-guided 3D scene synthesis through our factored diffusion formulation, leveraging latent semantic and geometric manifolds for generation of arbitrary-sized 3D scenes. While text input enables easy, controllable generation, text guidance remains imprecise for intuitive, localized editing and manipulation of the generated 3D scenes. Our factored semantic diffusion generates a proxy semantic space composed of semantic 3D boxes that enables controllable editing of generated scenes by adding, removing, or changing the size of the semantic 3D proxy boxes, which in turn guides high-fidelity, consistent 3D geometric editing. Extensive experiments demonstrate that our approach enables high-fidelity 3D scene synthesis with effective…
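To make the box-based editing idea concrete, here is a minimal Python sketch. It is an illustration only, not the authors' implementation: `ProxyBox`, `rasterize`, and `changed_voxels` are hypothetical names, the semantic space is assumed to be a coarse voxel grid of integer class ids, and no diffusion model is involved. It shows how removing or resizing a semantic box changes a proxy map, and how the set of changed voxels could mark the region where geometry would need to be regenerated.

```python
from dataclasses import dataclass, replace
from typing import Dict, Iterable, List, Set, Tuple

Voxel = Tuple[int, int, int]
Grid = List[List[List[int]]]  # grid[x][y][z] = semantic class id, 0 = empty


@dataclass(frozen=True)
class ProxyBox:
    """Hypothetical semantic proxy box in voxel coordinates."""
    label: int   # semantic class id (> 0), assumed encoding
    lo: Voxel    # inclusive minimum corner
    hi: Voxel    # exclusive maximum corner


def rasterize(boxes: Iterable[ProxyBox], shape: Voxel) -> Grid:
    """Paint each box's label into a coarse semantic grid (later boxes win)."""
    nx, ny, nz = shape
    grid = [[[0] * nz for _ in range(ny)] for _ in range(nx)]
    for b in boxes:
        for x in range(max(b.lo[0], 0), min(b.hi[0], nx)):
            for y in range(max(b.lo[1], 0), min(b.hi[1], ny)):
                for z in range(max(b.lo[2], 0), min(b.hi[2], nz)):
                    grid[x][y][z] = b.label
    return grid


def changed_voxels(before: Grid, after: Grid) -> Set[Voxel]:
    """Voxels whose label differs: the region an edit would touch."""
    return {
        (x, y, z)
        for x, plane in enumerate(before)
        for y, row in enumerate(plane)
        for z, label in enumerate(row)
        if after[x][y][z] != label
    }


# Toy scene: class ids 1 ("bed") and 2 ("chair") are assumptions.
shape = (4, 4, 2)
scene: Dict[str, ProxyBox] = {
    "bed": ProxyBox(label=1, lo=(0, 0, 0), hi=(2, 2, 1)),
    "chair": ProxyBox(label=2, lo=(3, 3, 0), hi=(4, 4, 1)),
}
before = rasterize(scene.values(), shape)

# Edit: remove the chair and enlarge the bed along x.
edited = dict(scene)
del edited["chair"]
edited["bed"] = replace(scene["bed"], hi=(3, 2, 1))
after = rasterize(edited.values(), shape)

dirty = changed_voxels(before, after)
print(sorted(dirty))
```

In this sketch only the voxels in `dirty` differ between the two semantic maps, which mirrors the localized nature of the edits described in the abstract; how the actual method conditions geometry on the edited semantic map is not specified here.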

Taxonomy

Topics: Computer Graphics and Visualization Techniques · 3D Shape Modeling and Analysis · Human Motion and Animation

Methods: Diffusion