Let 2D Diffusion Model Know 3D-Consistency for Robust Text-to-3D Generation
Junyoung Seo, Wooseok Jang, Min-Seop Kwak, Hyeonsu Kim, Jaehoon Ko, Junho Kim, Jin-Hwa Kim, Jiyoung Lee, Seungryong Kim

TL;DR
This paper introduces 3DFuse, a framework that enhances pretrained 2D diffusion models with 3D awareness, improving the robustness and consistency of text-to-3D scene generation using score distillation.
Contribution
The paper proposes a novel method to incorporate 3D awareness into 2D diffusion models, enabling more stable and 3D-consistent text-to-3D generation.
Findings
Improved 3D consistency in generated scenes.
Enhanced robustness against errors in coarse 3D structures.
Surpassed prior methods in quality of 3D scene reconstruction.
Abstract
Text-to-3D generation has progressed rapidly with the advent of score distillation, a method that uses pretrained text-to-2D diffusion models to optimize a neural radiance field (NeRF) in a zero-shot setting. However, the lack of 3D awareness in 2D diffusion models destabilizes score distillation-based methods and prevents them from reconstructing a plausible 3D scene. To address this issue, we propose 3DFuse, a novel framework that incorporates 3D awareness into pretrained 2D diffusion models, enhancing the robustness and 3D consistency of score distillation-based methods. We realize this by first constructing a coarse 3D structure from a given text prompt and then using a projected, view-specific depth map as a condition for the diffusion model. Additionally, we introduce a training strategy that enables the 2D diffusion model to learn to handle the errors and sparsity within the…
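The depth-conditioning step described in the abstract can be illustrated with a small sketch. It assumes the coarse 3D structure is available as a point cloud and that views follow a simple pinhole camera model; neither detail is stated in this summary, and the function name `project_points_to_depth` and its parameters are hypothetical, chosen only for illustration. The sketch shows why a projected depth map tends to be sparse: most pixels receive no point at all.

```python
import numpy as np

def project_points_to_depth(points_world, world_to_cam, K, height, width):
    """Project a point cloud into one view and return a sparse depth map.

    Hypothetical helper, not the paper's implementation.
    points_world: (N, 3) points in world coordinates.
    world_to_cam: (4, 4) rigid transform from world to camera coordinates.
    K:            (3, 3) pinhole intrinsics.
    Pixels that no point lands on are set to 0.
    """
    n = points_world.shape[0]
    homo = np.concatenate([points_world, np.ones((n, 1))], axis=1)
    cam = (world_to_cam @ homo.T).T[:, :3]

    # Keep only points in front of the camera.
    z = cam[:, 2]
    front = z > 1e-6
    cam, z = cam[front], z[front]

    # Perspective projection to integer pixel coordinates.
    uv = (K @ cam.T).T
    u = np.round(uv[:, 0] / z).astype(int)
    v = np.round(uv[:, 1] / z).astype(int)

    # Discard points that fall outside the image.
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    u, v, z = u[inside], v[inside], z[inside]

    # Z-buffer: when several points hit one pixel, keep the nearest.
    depth = np.full((height, width), np.inf)
    np.minimum.at(depth, (v, u), z)
    depth[np.isinf(depth)] = 0.0
    return depth


# Toy usage: four points viewed by a camera at the world origin.
points = np.array([
    [0.0, 0.0, 2.0],   # visible, lands on the principal point
    [0.0, 0.0, 5.0],   # same pixel but farther away, so it is occluded
    [0.0, 0.0, -1.0],  # behind the camera
    [1.0, 0.0, 2.0],   # projects outside the 8x8 image
])
K = np.array([[10.0, 0.0, 4.0],
              [0.0, 10.0, 4.0],
              [0.0, 0.0, 1.0]])
depth_map = project_points_to_depth(points, np.eye(4), K, height=8, width=8)
```

In this toy case only one pixel of the 8x8 map receives a depth value, which mirrors the sparsity and error handling that the abstract says the training strategy is meant to address.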
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications
Methods: Diffusion
