X-Dreamer: Creating High-quality 3D Content by Bridging the Domain Gap Between Text-to-2D and Text-to-3D Generation

Yiwei Ma; Yijun Fan; Jiayi Ji; Haowei Wang; Xiaoshuai Sun; Guannan Jiang; Annan Shu; Rongrong Ji

arXiv:2312.00085·cs.CV·July 31, 2024·2 cites

TL;DR

X-Dreamer introduces a novel method that bridges the domain gap between 2D and 3D generation, improving the quality and accuracy of text-to-3D content by incorporating camera guidance and attention-mask alignment.

Contribution

The paper proposes two components, Camera-Guided Low-Rank Adaptation (CG-LoRA) and an Attention-Mask Alignment (AMA) loss, to improve 3D content creation by aligning 2D diffusion models with 3D representations.

Findings

01. Outperforms existing text-to-3D methods in quality and accuracy.

02. Effectively incorporates camera information into diffusion models.

03. Focuses on foreground object detail and alignment.

Abstract

In recent times, automatic text-to-3D content creation has made significant progress, driven by the development of pretrained 2D diffusion models. Existing text-to-3D methods typically optimize the 3D representation to ensure that the rendered image aligns well with the given text, as evaluated by the pretrained 2D diffusion model. Nevertheless, a substantial domain gap exists between 2D images and 3D assets, primarily attributed to variations in camera-related attributes and the exclusive presence of foreground objects. Consequently, employing 2D diffusion models directly for optimizing 3D representations may lead to suboptimal outcomes. To address this issue, we present X-Dreamer, a novel approach for high-quality text-to-3D content creation that effectively bridges the gap between text-to-2D and text-to-3D synthesis. The key components of X-Dreamer are two innovative designs:…

Code & Models

Repositories

xmu-xiaoma666/X-Dreamer (PyTorch, official)

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques

Methods: Diffusion