Sherpa3D: Boosting High-Fidelity Text-to-3D Generation via Coarse 3D Prior
Fangfu Liu, Diankun Wu, Yi Wei, Yongming Rao, Yueqi Duan

TL;DR
Sherpa3D leverages coarse 3D priors to enhance text-to-3D generation, improving quality, diversity, and geometric consistency by guiding 2D diffusion models with structural and semantic cues.
Contribution
It introduces a novel framework that exploits coarse 3D knowledge to guide 2D diffusion-based text-to-3D generation, addressing multi-face issues and enhancing output fidelity.
Findings
Outperforms state-of-the-art methods in quality and 3D consistency.
Enables high-fidelity and diverse 3D asset generation from text prompts.
Effectively mitigates view-agnostic ambiguity in 3D generation.
Abstract
Recently, 3D content creation from text prompts has demonstrated remarkable progress by utilizing 2D and 3D diffusion models. While 3D diffusion models ensure great multi-view consistency, their ability to generate high-quality and diverse 3D assets is hindered by the limited 3D data. In contrast, 2D diffusion models find a distillation approach that achieves excellent generalization and rich details without any 3D data. However, 2D lifting methods suffer from inherent view-agnostic ambiguity, thereby leading to serious multi-face Janus issues, where text prompts fail to provide sufficient guidance to learn coherent 3D results. Instead of retraining a costly viewpoint-aware model, we study how to fully exploit easily accessible coarse 3D knowledge to enhance the prompts and guide 2D lifting optimization for refinement. In this paper, we propose Sherpa3D, a new text-to-3D framework that…
Taxonomy
Topics: 3D Shape Modeling and Analysis · Computer Graphics and Visualization Techniques · Handwritten Text Recognition Techniques
Methods: Diffusion
