An Object is Worth 64x64 Pixels: Generating 3D Object via Image Diffusion

Xingguang Yan; Han-Hung Lee; Ziyu Wan; Angel X. Chang

arXiv:2408.03178·cs.CV·August 7, 2024

TL;DR

This paper presents a novel method for generating 3D models by encoding surface details into 64x64 pixel images called 'Object Images', enabling the use of image diffusion models for 3D shape creation.

Contribution

It introduces 'Object Images' as a new 2D representation for 3D shapes, simplifying complex geometry and enabling direct use of image diffusion models for 3D generation.

Findings

01. Achieves point cloud FID comparable to recent 3D generative models on the ABO dataset

02. Naturally supports PBR material generation

03. Addresses the geometric and semantic irregularity of polygonal meshes

Abstract

We introduce a new approach for generating realistic 3D models with UV maps through a representation termed "Object Images." This approach encapsulates surface geometry, appearance, and patch structures within a 64x64 pixel image, effectively converting complex 3D shapes into a more manageable 2D format. By doing so, we address the challenges of both geometric and semantic irregularity inherent in polygonal meshes. This method allows us to use image generation models, such as Diffusion Transformers, directly for 3D shape generation. Evaluated on the ABO dataset, our generated shapes with patch structures achieve point cloud FID comparable to recent 3D generative models, while naturally supporting PBR material generation.
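
To make the idea of storing a surface in a small image concrete, here is a minimal sketch of decoding such an image back into 3D geometry. It is not the authors' code. The channel layout is an assumption made for illustration only: a 64x64 array whose first three channels hold an xyz position per pixel and whose fourth channel holds an occupancy value marking which pixels belong to a patch. The function names `make_toy_omage`, `omage_to_points`, and `omage_to_triangles` are hypothetical, and appearance channels (for example PBR materials) are left out.

```python
import numpy as np


def make_toy_omage(size=64):
    """Build a toy image: a flat square patch in the z = 0 plane.

    Assumed (hypothetical) layout: channels 0-2 are xyz, channel 3 is occupancy.
    """
    omage = np.zeros((size, size, 4), dtype=np.float32)
    u, v = np.meshgrid(
        np.linspace(0.0, 1.0, size), np.linspace(0.0, 1.0, size), indexing="ij"
    )
    omage[..., 0] = u  # x follows the row coordinate
    omage[..., 1] = v  # y follows the column coordinate
    omage[10:20, 10:20, 3] = 1.0  # one occupied 10x10 patch
    return omage


def omage_to_points(omage, occ_threshold=0.5):
    """Return the (N, 3) positions stored in occupied pixels."""
    omage = np.asarray(omage, dtype=np.float32)
    mask = omage[..., 3] > occ_threshold
    return omage[..., :3][mask]


def omage_to_triangles(omage, occ_threshold=0.5):
    """Turn every fully occupied 2x2 pixel block into two triangles.

    Returns (vertices, faces); vertices has one row per pixel, and faces
    indexes into it. Pixels outside any patch are simply never referenced.
    """
    omage = np.asarray(omage, dtype=np.float32)
    h, w, _ = omage.shape
    mask = omage[..., 3] > occ_threshold
    index = np.arange(h * w).reshape(h, w)

    # A quad is kept only if all four of its corner pixels are occupied,
    # so triangles never bridge two separate patches across empty pixels.
    quad = mask[:-1, :-1] & mask[:-1, 1:] & mask[1:, :-1] & mask[1:, 1:]
    a = index[:-1, :-1][quad]  # top-left corner
    b = index[:-1, 1:][quad]   # top-right corner
    c = index[1:, :-1][quad]   # bottom-left corner
    d = index[1:, 1:][quad]    # bottom-right corner

    faces = np.concatenate(
        [np.stack([a, b, c], axis=1), np.stack([b, d, c], axis=1)], axis=0
    )
    vertices = omage[..., :3].reshape(-1, 3)
    return vertices, faces


if __name__ == "__main__":
    toy = make_toy_omage()
    points = omage_to_points(toy)
    vertices, faces = omage_to_triangles(toy)
    print(points.shape, vertices.shape, faces.shape)
```

Because the geometry lives on a regular pixel grid, connectivity comes for free from pixel adjacency, which is one way to read the abstract's claim that the representation sidesteps the irregularity of polygonal meshes. Under the same assumption, a generative model only has to produce a fixed-size multi-channel image.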

Code & Models

Datasets

3dlg-hcvc/omages_ABO
dataset · 790 dl

Taxonomy

Topics: Medical Image Segmentation Techniques · Image Processing and 3D Reconstruction · Advanced Image and Video Retrieval Techniques

Methods: Diffusion