3D-TOGO: Towards Text-Guided Cross-Category 3D Object Generation

Zutao Jiang; Guansong Lu; Xiaodan Liang; Jihua Zhu; Wei Zhang; Xiaojun Chang; Hang Xu

arXiv:2212.01103·cs.CV·August 17, 2023·1 cites

TL;DR

This paper introduces 3D-TOGO, a novel model for text-guided cross-category 3D object generation that produces textured 3D neural radiance fields without time-consuming per-case optimization.

Contribution

The paper presents the first generic approach combining text-to-views and views-to-3D modules for efficient, high-quality 3D object generation guided by captions across multiple categories.

Findings

01

Outperforms the compared baselines (text-NeRF and Dreamfields) on PSNR, SSIM, LPIPS, and CLIP-score.

02

Generates textured 3D objects without per-case optimization.

03

Effective across 98 categories in the ABO dataset.
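Of the metrics named in the first finding, PSNR is the simplest to state: it is the log-scaled ratio between the peak pixel value and the mean squared error of a rendered view against its reference. The snippet below is a minimal sketch of that standard definition, not the paper's evaluation code; the function name `psnr`, the flat-list image representation, and the `max_value=1.0` pixel range are assumptions made for illustration.

```python
import math


def psnr(reference, rendered, max_value=1.0):
    """Peak signal-to-noise ratio (dB) between two equally sized pixel sequences.

    Assumes pixels are floats in [0, max_value]; this range is an
    illustrative choice, not something stated in the source.
    """
    if len(reference) != len(rendered) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - g) ** 2 for r, g in zip(reference, rendered)) / len(reference)
    if mse == 0:
        return math.inf  # identical images have unbounded PSNR
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher PSNR and SSIM indicate a closer match to the reference view, whereas LPIPS is a distance, so lower is better.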

Abstract

Text-guided 3D object generation aims to generate 3D objects described by user-defined captions, which paves a flexible way to visualize what we imagine. Although some works have been devoted to solving this challenging task, they either utilize explicit 3D representations (e.g., mesh), which lack texture and require post-processing for rendering photo-realistic views, or require individual time-consuming optimization for every single case. Here, we make the first attempt to achieve generic text-guided cross-category 3D object generation via a new 3D-TOGO model, which integrates a text-to-views generation module and a views-to-3D generation module. The text-to-views generation module is designed to generate different views of the target 3D object given an input caption. Prior-guidance, caption-guidance and view contrastive learning are proposed for achieving better…
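The abstract names view contrastive learning but the excerpt does not give its formulation. As a rough illustration of what a contrastive objective over view embeddings can look like, here is a generic InfoNCE-style loss; this is a swapped-in standard technique, not the paper's loss, and the names `info_nce`, `anchor`, `positive`, `negatives` and the `temperature` value are all hypothetical.

```python
import math


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss for one anchor embedding (illustrative only).

    anchor / positive: embeddings of two views assumed to show the same object.
    negatives: embeddings of views assumed to show other objects.
    """
    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm_u = math.sqrt(sum(a * a for a in u))
        norm_v = math.sqrt(sum(b * b for b in v))
        return dot / (norm_u * norm_v)

    # Index 0 holds the positive pair; the rest are negatives.
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]

    # Numerically stable log-sum-exp, then cross-entropy against index 0.
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum_exp - logits[0]
```

The loss shrinks as the anchor moves closer to its positive view and further from the negatives, which is the general mechanism by which a contrastive term can encourage consistency between views of one object.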

Taxonomy

Topics: 3D Surveying and Cultural Heritage · Image Processing and 3D Reconstruction · Computer Graphics and Visualization Techniques

Methods: Contrastive Learning