Geometry Image Diffusion: Fast and Data-Efficient Text-to-3D with Image-Based Surface Representation

Slava Elizarov; Ciara Rowles; Simon Donné

arXiv:2409.03718·cs.CV·September 6, 2024

TL;DR

Geometry Image Diffusion (GIMDiffusion) offers a fast, data-efficient method for text-to-3D generation by leveraging 2D image representations of 3D shapes, enabling high-quality 3D asset creation with limited data.

Contribution

We introduce GIMDiffusion, a novel approach that uses geometry images for efficient 3D shape representation, reducing data requirements and computational costs compared to traditional methods.

Findings

01 Generates 3D assets at speeds comparable to current Text-to-Image models

02 Produces semantically meaningful, part-aware 3D objects

03 Generalizes well even with limited 3D training data

Abstract

Generating high-quality 3D objects from textual descriptions remains a challenging problem due to computational cost, the scarcity of 3D data, and complex 3D representations. We introduce Geometry Image Diffusion (GIMDiffusion), a novel Text-to-3D model that utilizes geometry images to efficiently represent 3D shapes using 2D images, thereby avoiding the need for complex 3D-aware architectures. By integrating a Collaborative Control mechanism, we exploit the rich 2D priors of existing Text-to-Image models such as Stable Diffusion. This enables strong generalization even with limited 3D training data (allowing us to use only high-quality training data) as well as retaining compatibility with guidance techniques such as IPAdapter. In short, GIMDiffusion enables the generation of 3D assets at speeds comparable to current Text-to-Image models. The generated objects consist of semantically…
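To make the idea of representing a 3D shape as a 2D image concrete, here is a minimal sketch of how a single geometry image could be turned back into a triangle mesh. It assumes the common convention that each pixel stores the XYZ position of one surface point and that neighbouring pixels are neighbours on the surface. The function name `geometry_image_to_mesh` and the optional `mask` argument (marking which pixels belong to the surface) are hypothetical and chosen for illustration; this is not the authors' pipeline.

```python
import numpy as np


def geometry_image_to_mesh(gim, mask=None):
    """Convert an (H, W, 3) geometry image into vertices and triangle faces.

    Assumption: pixel (i, j) holds the XYZ position of a surface point, and
    adjacent pixels are adjacent on the surface. `mask` is a hypothetical
    (H, W) boolean array marking pixels that carry valid geometry.
    """
    h, w, _ = gim.shape
    vertices = gim.reshape(-1, 3)  # one vertex per pixel, row-major order
    if mask is None:
        mask = np.ones((h, w), dtype=bool)

    # Vertex index of every pixel, laid out on the same grid as the image.
    idx = np.arange(h * w).reshape(h, w)

    # The four corners of every 2x2 pixel cell.
    tl, tr = idx[:-1, :-1], idx[:-1, 1:]
    bl, br = idx[1:, :-1], idx[1:, 1:]

    # Keep a cell only if all four of its corners are valid pixels.
    valid = mask[:-1, :-1] & mask[:-1, 1:] & mask[1:, :-1] & mask[1:, 1:]

    # Split each valid cell into two triangles.
    tri_a = np.stack([tl[valid], bl[valid], tr[valid]], axis=-1)
    tri_b = np.stack([tr[valid], bl[valid], br[valid]], axis=-1)
    faces = np.concatenate([tri_a, tri_b], axis=0)
    return vertices, faces


# Usage: a flat 4x5 grid of points in the z = 0 plane.
u, v = np.meshgrid(np.linspace(0.0, 1.0, 5), np.linspace(0.0, 1.0, 4))
gim = np.stack([u, v, np.zeros_like(u)], axis=-1)  # shape (4, 5, 3)
vertices, faces = geometry_image_to_mesh(gim)
print(vertices.shape, faces.shape)  # (20, 3) (24, 3)
```

Because the connectivity is implied by the pixel grid, a model only has to predict an image-shaped array of coordinates, which is what lets a 2D image architecture output 3D surfaces in this sketch.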

Videos

Geometry Image Diffusion: Fast and Data-Efficient Text-to-3D with Image-Based Surface Representation (SlidesLive)

Taxonomy

Topics: Handwritten Text Recognition Techniques · Image Processing and 3D Reconstruction · Image Retrieval and Classification Techniques

Methods: Diffusion