ScaleDreamer: Scalable Text-to-3D Synthesis with Asynchronous Score Distillation

Zhiyuan Ma; Yuxiang Wei; Yabin Zhang; Xiangyu Zhu; Zhen Lei; Lei Zhang

arXiv:2407.02040·cs.CV·July 3, 2024

TL;DR

ScaleDreamer introduces Asynchronous Score Distillation, a stable and scalable method that leverages early diffusion timesteps to efficiently synthesize high-quality 3D content from large text prompt datasets without retraining diffusion models.

Contribution

The paper proposes Asynchronous Score Distillation (ASD), a score distillation approach that improves the scalability and stability of text-to-3D synthesis by shifting the diffusion timestep rather than fine-tuning the diffusion model, which leaves the pretrained model intact.
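The timestep-shifting idea can be illustrated with a toy sketch. Everything below is an assumption made for illustration, not the authors' implementation: `toy_noise_pred` is a hypothetical stand-in for a frozen pretrained noise predictor, the linear noise schedule is invented, and `shift` is left as a free parameter because this summary does not state its sign or magnitude. The point is only that the residual comes from two calls to the same frozen model at different timesteps, so no second, fine-tuned model is needed.

```python
import numpy as np


def toy_noise_pred(x_t, t, num_steps=1000):
    """Hypothetical stand-in for a frozen, pretrained noise predictor."""
    # Not a real diffusion model: just a deterministic function of (x_t, t).
    return x_t * (t / num_steps)


def add_noise(x0, eps, t, num_steps=1000):
    """Noise a clean image x0 to timestep t (toy linear schedule, assumed)."""
    alpha_bar = 1.0 - t / num_steps
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps


def asd_style_residual(x0, t, shift, rng, num_steps=1000):
    """Difference between two frozen-model predictions at t and t + shift."""
    eps = rng.standard_normal(x0.shape)
    # Keep the shifted timestep inside the valid range of the toy schedule.
    t_shifted = int(np.clip(t + shift, 1, num_steps - 1))
    pred_t = toy_noise_pred(add_noise(x0, eps, t), t)
    pred_shifted = toy_noise_pred(add_noise(x0, eps, t_shifted), t_shifted)
    # In a score distillation loop this residual would be back-propagated
    # through the renderer to the 3D parameters; that step is omitted here.
    return pred_t - pred_shifted


rendered = np.ones((4, 4))  # placeholder for a rendered view
residual = asd_style_residual(rendered, t=500, shift=100,
                              rng=np.random.default_rng(0))
print(residual.shape)
```

With `shift=0` the two predictions coincide and the residual vanishes, which makes the role of the timestep offset easy to see in isolation.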

Findings

01 ASD scales up to 100k prompts effectively.

02 Achieves high-quality 3D content synthesis.

03 Demonstrates superior prompt consistency.

Abstract

By leveraging text-to-image diffusion priors, score distillation can synthesize 3D content without paired text-3D training data. Instead of spending hours of online optimization per text prompt, recent studies have focused on learning a text-to-3D generative network that amortizes multiple text-3D relations and can synthesize 3D content in seconds. However, existing score distillation methods are hard to scale up to a large number of text prompts because of the difficulty of aligning the pretrained diffusion prior with the distribution of rendered images from various text prompts. Current state-of-the-art methods such as Variational Score Distillation finetune the pretrained diffusion model to minimize the noise prediction error so as to align the distributions; this finetuning, however, is unstable to train and impairs the model's ability to comprehend numerous text prompts. Based on…

Code & Models

Repositories

theericma/scaledreamer (jax, Official)

Taxonomy

Topics: Modular Robots and Swarm Intelligence · Music Technology and Sound Studies · Interactive and Immersive Displays

Methods: Diffusion · ALIGN