RealmDreamer: Text-Driven 3D Scene Generation with Inpainting and Depth Diffusion
Jaidev Shriram, Alex Trevithick, Lingjie Liu, Ravi Ramamoorthi

TL;DR
RealmDreamer is a novel text-to-3D scene generation method that uses diffusion models for inpainting and depth estimation, enabling high-quality, style-diverse 3D synthesis from text or a single image without requiring multi-view data.
Contribution
It introduces a diffusion-based framework for 3D scene synthesis that leverages inpainting and depth models, eliminating the need for multi-view data and enabling single-image 3D generation.
Findings
Users prefer its results over those of existing methods 88-95% of the time.
Can generate diverse high-quality 3D scenes from text or a single image.
Does not require video or multi-view data for training.
Abstract
We introduce RealmDreamer, a technique for generating forward-facing 3D scenes from text descriptions. Our method optimizes a 3D Gaussian Splatting representation to match complex text prompts using pretrained diffusion models. Our key insight is to leverage 2D inpainting diffusion models conditioned on an initial scene estimate to provide low variance supervision for unknown regions during 3D distillation. In conjunction, we imbue high-fidelity geometry with geometric distillation from a depth diffusion model, conditioned on samples from the inpainting model. We find that the initialization of the optimization is crucial, and provide a principled methodology for doing so. Notably, our technique doesn't require video or multi-view data and can synthesize various high-quality 3D scenes in different styles with complex layouts. Further, the generality of our method allows 3D synthesis…
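To make the idea of supervising only the unknown regions concrete, the following is a minimal sketch of a masked image-space loss that combines a color term (a target from an inpainting model) with a depth term (a target from a depth model). The abstract does not give the loss in this form: the function name, the squared-error terms, the mask convention, and the depth_weight parameter are all assumptions made for illustration, and the arrays stand in for renders of the 3D Gaussian Splatting scene and for diffusion-model samples.

```python
import numpy as np


def masked_distillation_loss(render_rgb, render_depth,
                             target_rgb, target_depth,
                             unknown_mask, depth_weight=0.5):
    """Hypothetical toy loss, not the paper's actual objective.

    render_rgb, target_rgb:     (H, W, 3) float arrays
    render_depth, target_depth: (H, W) float arrays
    unknown_mask:               (H, W) array, nonzero where the initial
                                scene estimate has no content
    """
    mask = (np.asarray(unknown_mask) != 0).astype(np.float64)
    # Avoid division by zero when nothing is masked.
    n_unknown = max(mask.sum(), 1.0)

    # Squared color error per pixel, summed over the RGB channels.
    rgb_err = ((render_rgb - target_rgb) ** 2).sum(axis=-1)
    rgb_loss = (mask * rgb_err).sum() / n_unknown

    # Squared depth error per pixel.
    depth_err = (render_depth - target_depth) ** 2
    depth_loss = (mask * depth_err).sum() / n_unknown

    # Assumed weighting between the color and geometry terms.
    return rgb_loss + depth_weight * depth_loss


if __name__ == "__main__":
    h, w = 4, 4
    mask = np.zeros((h, w))
    mask[:, w // 2:] = 1  # right half of the image is "unknown"
    loss = masked_distillation_loss(
        np.zeros((h, w, 3)), np.zeros((h, w)),
        np.ones((h, w, 3)), np.full((h, w), 2.0),
        mask,
    )
    print(loss)
```

In an actual pipeline the renders would come from a differentiable rasterizer and the loss would be minimized with respect to the Gaussian parameters; NumPy is used here only to keep the sketch self-contained.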
Taxonomy
Topics: Human Motion and Animation · Computer Graphics and Visualization Techniques · Image Processing and 3D Reconstruction
Methods: Inpainting · Diffusion
