ScaleDreamer: Scalable Text-to-3D Synthesis with Asynchronous Score Distillation
Zhiyuan Ma, Yuxiang Wei, Yabin Zhang, Xiangyu Zhu, Zhen Lei, Lei Zhang

TL;DR
ScaleDreamer introduces Asynchronous Score Distillation, a stable and scalable method that leverages early diffusion timesteps to efficiently synthesize high-quality 3D content from large text prompt datasets without retraining diffusion models.
Contribution
The paper proposes Asynchronous Score Distillation (ASD), a score distillation approach that improves the scalability and stability of text-to-3D synthesis by shifting the diffusion timestep, leaving the pretrained diffusion model unchanged.
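To make the timestep-shift idea concrete, here is a minimal NumPy sketch that contrasts a classic score-distillation residual with a timestep-shifted one. It is an illustration under stated assumptions, not the authors' implementation: `toy_eps` is a hypothetical stand-in for a frozen noise predictor, the linear beta schedule is a common default, and the exact form, sign, and size of `shift` are assumptions because this summary does not give the ASD formula.

```python
import numpy as np

def add_noise(x0, noise, alpha_bar_t):
    """Forward diffusion: x_t = sqrt(a) * x0 + sqrt(1 - a) * noise."""
    return np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * noise

def toy_eps(x_t, t, alpha_bar):
    """Hypothetical stand-in for a frozen noise predictor.

    This is the exact optimum when clean images follow N(0, I), where
    E[noise | x_t] = sqrt(1 - alpha_bar[t]) * x_t. A real system would
    call a pretrained text-conditioned diffusion model here.
    """
    return np.sqrt(1.0 - alpha_bar[t]) * x_t

def sds_residual(x0, t, noise, alpha_bar, eps_fn=toy_eps):
    """Classic score-distillation residual: predicted noise minus injected noise."""
    x_t = add_noise(x0, noise, alpha_bar[t])
    return eps_fn(x_t, t, alpha_bar) - noise

def shifted_residual(x0, t, shift, noise, alpha_bar, eps_fn=toy_eps):
    """Timestep-shifted residual (assumed form, not taken from the paper).

    The same frozen predictor is queried a second time at t + shift, and
    that second prediction serves as the reference term, so no weights
    of the diffusion model are updated.
    """
    t_ref = int(np.clip(t + shift, 0, len(alpha_bar) - 1))
    x_t = add_noise(x0, noise, alpha_bar[t])
    x_ref = add_noise(x0, noise, alpha_bar[t_ref])
    return eps_fn(x_t, t, alpha_bar) - eps_fn(x_ref, t_ref, alpha_bar)

# Linear beta schedule with 1000 steps (an assumed, commonly used default).
betas = np.linspace(1e-4, 0.02, 1000)
alpha_bar = np.cumprod(1.0 - betas)

rng = np.random.default_rng(0)
x0 = rng.standard_normal((4, 8, 8))      # stand-in for rendered images
noise = rng.standard_normal(x0.shape)

r_sds = sds_residual(x0, 500, noise, alpha_bar)
r_shift = shifted_residual(x0, 500, 100, noise, alpha_bar)
print(r_sds.shape, r_shift.shape)
```

In a full pipeline the residual would be multiplied by a timestep weight and back-propagated through the renderer into the 3D generator. With `shift=0` the two queries coincide and the shifted residual vanishes, which is a quick sanity check on the sketch.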
Findings
ASD scales up to 100k prompts effectively.
Achieves high-quality 3D content synthesis.
Demonstrates superior prompt consistency.
Abstract
By leveraging text-to-image diffusion priors, score distillation can synthesize 3D content without paired text-3D training data. Instead of spending hours on online optimization per text prompt, recent studies have focused on learning a text-to-3D generative network that amortizes multiple text-3D relations and can synthesize 3D content in seconds. However, existing score distillation methods are hard to scale up to a large number of text prompts because it is difficult to align the pretrained diffusion prior with the distribution of images rendered from various text prompts. Current state-of-the-art methods such as Variational Score Distillation fine-tune the pretrained diffusion model to minimize the noise prediction error and thereby align the distributions; however, this fine-tuning is unstable to train and impairs the model's ability to comprehend numerous text prompts. Based on…
Taxonomy
Topics: Modular Robots and Swarm Intelligence · Music Technology and Sound Studies · Interactive and Immersive Displays
Methods: Diffusion · ALIGN
