DIRECT-3D: Learning Direct Text-to-3D Generation on Massive Noisy 3D Data

Qihao Liu; Yi Zhang; Song Bai; Adam Kortylewski; Alan Yuille

arXiv:2406.04322 · cs.CV · June 10, 2024 · 1 citation

TL;DR

DIRECT-3D introduces a diffusion-based 3D generative model trained on noisy, unaligned data, enabling high-quality, detailed 3D asset creation from text prompts with state-of-the-art results.

Contribution

The paper presents a novel tri-plane diffusion model that automatically filters and aligns noisy 3D data during training, improving large-scale text-to-3D generation.
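
For readers unfamiliar with the tri-plane representation mentioned above: a 3D field is stored as three axis-aligned 2D feature planes, and a 3D point is looked up by projecting it onto each plane and combining the sampled features. The following is a minimal NumPy sketch of that lookup, not the authors' implementation. The function names, the plane ordering (XY, XZ, YZ), the use of summation to combine features, and the [-1, 1] coordinate convention are all illustrative assumptions that do not come from the paper or its repository.

```python
import numpy as np


def bilinear_sample(plane, u, v):
    """Bilinearly sample a (C, H, W) feature plane at coords u, v in [-1, 1].

    Hypothetical helper for illustration; requires H >= 2 and W >= 2.
    Returns an array of shape (C, N) for N query coordinates.
    """
    _, H, W = plane.shape
    # Map normalized coordinates to continuous pixel positions.
    x = (u + 1.0) * 0.5 * (W - 1)
    y = (v + 1.0) * 0.5 * (H - 1)
    # Index of the top-left neighbor, clipped so that +1 stays in bounds.
    x0 = np.clip(np.floor(x).astype(int), 0, W - 2)
    y0 = np.clip(np.floor(y).astype(int), 0, H - 2)
    # Interpolation weights toward the right and bottom neighbors.
    wx = np.clip(x - x0, 0.0, 1.0)
    wy = np.clip(y - y0, 0.0, 1.0)
    top = plane[:, y0, x0] * (1 - wx) + plane[:, y0, x0 + 1] * wx
    bottom = plane[:, y0 + 1, x0] * (1 - wx) + plane[:, y0 + 1, x0 + 1] * wx
    return top * (1 - wy) + bottom * wy


def query_triplane(planes, points):
    """Look up per-point features from a tri-plane.

    planes: array of shape (3, C, H, W), assumed ordered as XY, XZ, YZ.
    points: array of shape (N, 3) with coordinates in [-1, 1].
    Returns features of shape (N, C), summed over the three planes
    (summation is an assumption; concatenation is another common choice).
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    f_xy = bilinear_sample(planes[0], x, y)
    f_xz = bilinear_sample(planes[1], x, z)
    f_yz = bilinear_sample(planes[2], y, z)
    return (f_xy + f_xz + f_yz).T
```

In tri-plane NeRF pipelines, a small decoder network typically maps these per-point features to density and color for volume rendering; that step is omitted here.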

Findings

01. Achieves state-of-the-art performance in text-to-3D generation.
02. Generates high-resolution, realistic 3D objects in seconds.
03. Can serve as a 3D prior to improve other methods.

Abstract

We present DIRECT-3D, a diffusion-based 3D generative model for creating high-quality 3D assets (represented by Neural Radiance Fields) from text prompts. Unlike recent 3D generative models that rely on clean and well-aligned 3D data, limiting them to single or few-class generation, our model is directly trained on extensive noisy and unaligned 'in-the-wild' 3D assets, mitigating the key challenge (i.e., data scarcity) in large-scale 3D generation. In particular, DIRECT-3D is a tri-plane diffusion model that integrates two innovations: 1) A novel learning framework where noisy data are filtered and aligned automatically during the training process. Specifically, after an initial warm-up phase using a small set of clean data, an iterative optimization is introduced in the diffusion process to explicitly estimate the 3D pose of objects and select beneficial data based on conditional…

Code & Models

Repositories

qihao067/direct3d
PyTorch · Official

Taxonomy

Topics: Image Processing and 3D Reconstruction · Handwritten Text Recognition Techniques · Human Motion and Animation

Methods: Sparse Evolutionary Training · Diffusion