Deep Geometric Moments Promote Shape Consistency in Text-to-3D Generation

Utkarsh Nath; Rajeev Goel; Eun Som Jeon; Changhoon Kim; Kyle Min; Yezhou Yang; Yingzhen Yang; Pavan Turaga

arXiv:2408.05938 · cs.CV · January 23, 2025

Open Access

TL;DR

This paper introduces MT3D, a novel text-to-3D generation model that uses depth maps and geometric moments from high-quality 3D models to improve shape consistency and reduce viewpoint bias in generated objects.

Contribution

MT3D combines depth-guided control and deep geometric moments to explicitly enforce shape and geometric consistency in text-to-3D generation, addressing viewpoint bias issues.

Findings

01. Reduces viewpoint bias and geometric inconsistencies in generated 3D objects.

02. Produces more diverse and geometrically accurate 3D assets.

03. Improves overall quality and usability of text-to-3D models.

Abstract

To address the data scarcity associated with 3D assets, 2D-lifting techniques such as Score Distillation Sampling (SDS) have become a widely adopted practice in text-to-3D generation pipelines. However, the diffusion models used in these techniques are prone to viewpoint bias and thus lead to geometric inconsistencies such as the Janus problem. To counter this, we introduce MT3D, a text-to-3D generative model that leverages a high-fidelity 3D object to overcome viewpoint bias and explicitly infuse geometric understanding into the generation pipeline. Firstly, we employ depth maps derived from a high-quality 3D model as control signals to guarantee that the generated 2D images preserve the fundamental shape and structure, thereby reducing the inherent viewpoint bias. Next, we utilize deep geometric moments to ensure geometric consistency in the 3D representation explicitly. By…

Taxonomy

Topics: Human Motion and Animation · Computer Graphics and Visualization Techniques · Image Processing and 3D Reconstruction

Methods: Diffusion