HD-Fusion: Detailed Text-to-3D Generation Leveraging Multiple Noise Estimation

Jinbo Wu; Xiaobo Gao; Xing Liu; Zhengyang Shen; Chen Zhao; Haocheng Feng; Jingtuo Liu; Errui Ding

arXiv:2307.16183·cs.CV·August 1, 2023

TL;DR

This paper introduces HD-Fusion, a method that combines multiple noise estimation processes with a pretrained 2D diffusion prior to improve the quality and detail of text-to-3D generation and to enable rendering at higher resolutions.

Contribution

It presents a new approach integrating multiple noise estimation processes with pretrained 2D diffusion priors for enhanced 3D content generation from text.

Findings

01. Produces higher-quality 3D models with more detail

02. Outperforms baseline methods in quality metrics

03. Enables higher-resolution 3D rendering

Abstract

In this paper, we study Text-to-3D content generation leveraging 2D diffusion priors to enhance the quality and detail of the generated 3D models. Recent progress (Magic3D) in text-to-3D has shown that employing high-resolution (e.g., 512 × 512) renderings can lead to the production of high-quality 3D models using latent diffusion priors. To enable rendering at even higher resolutions, which has the potential to further augment the quality and detail of the models, we propose a novel approach that combines multiple noise estimation processes with a pretrained 2D diffusion prior. Unlike the study by Bar-Tal et al., which binds multiple denoised results to generate images from text, our approach integrates the computation of score distillation losses such as the SDS and VSD losses, which are essential techniques for 3D content generation with 2D diffusion priors. We…
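The abstract is cut off before the method is described, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the "multiple noise estimation processes" are run on overlapping windows of a noised high-resolution rendering and averaged where they overlap (in the spirit of the Bar-Tal et al. approach the abstract cites), and that the fused estimate then feeds the standard SDS gradient w(t)(ε̂ − ε). The names `tile_origins`, `fused_noise_estimate`, `sds_gradient`, the `predict_noise` callable, and the tile and stride sizes are all hypothetical.

```python
import numpy as np


def tile_origins(size, tile, stride):
    """Top-left offsets of windows of width `tile` covering [0, size).

    Assumes size >= tile (hypothetical helper, not from the paper).
    """
    origins = list(range(0, size - tile + 1, stride))
    if origins[-1] != size - tile:
        origins.append(size - tile)  # last window sits flush with the border
    return origins


def fused_noise_estimate(noisy, predict_noise, tile=64, stride=32):
    """Average per-window noise estimates over overlapping windows.

    `predict_noise` stands in for a pretrained 2D diffusion model applied
    to one window of shape (..., tile, tile).
    """
    height, width = noisy.shape[-2:]
    total = np.zeros_like(noisy)
    count = np.zeros_like(noisy)
    for top in tile_origins(height, tile, stride):
        for left in tile_origins(width, tile, stride):
            window = (..., slice(top, top + tile), slice(left, left + tile))
            total[window] += predict_noise(noisy[window])
            count[window] += 1.0
    return total / count  # every pixel is covered by at least one window


def sds_gradient(latent, predict_noise, alpha_bar, weight=1.0, rng=None):
    """SDS gradient with respect to the rendering: weight * (eps_hat - eps)."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.standard_normal(latent.shape)
    # Forward diffusion step at a noise level given by alpha_bar in (0, 1).
    noisy = np.sqrt(alpha_bar) * latent + np.sqrt(1.0 - alpha_bar) * noise
    eps_hat = fused_noise_estimate(noisy, predict_noise)
    return weight * (eps_hat - noise)
```

In a full pipeline this gradient would be backpropagated through a differentiable renderer to the 3D representation; that step, and how a VSD loss would replace the sampled noise term, are outside what this sketch covers.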

Taxonomy

Topics: Computer Graphics and Visualization Techniques · Generative Adversarial Networks and Image Synthesis · Image Processing and 3D Reconstruction

Methods: Diffusion