WordRobe: Text-Guided Generation of Textured 3D Garments

Astitva Srivastava; Pranav Manu; Amit Raj; Varun Jampani; Avinash Sharma

arXiv:2403.17541 · cs.CV · July 16, 2024 · 1 citation

Open Access

TL;DR

WordRobe is a novel framework that enables text-guided generation and editing of textured 3D garments with high quality and efficiency, using a combination of latent space learning and CLIP alignment.

Contribution

The paper introduces a new method for generating textured 3D garments from text prompts, featuring a novel coarse-to-fine training strategy and efficient texture synthesis with ControlNet.

Findings

1. Outperforms current state-of-the-art methods in 3D garment generation and editing.

2. Enables view-consistent texture synthesis in a single inference step.

3. Produces unposed 3D garments compatible with standard simulation pipelines.

Abstract

In this paper, we tackle a new and challenging problem of text-driven generation of 3D garments with high-quality textures. We propose "WordRobe", a novel framework for the generation of unposed & textured 3D garment meshes from user-friendly text prompts. We achieve this by first learning a latent representation of 3D garments using a novel coarse-to-fine training strategy and a loss for latent disentanglement, promoting better latent interpolation. Subsequently, we align the garment latent space to the CLIP embedding space in a weakly supervised manner, enabling text-driven 3D garment generation and editing. For appearance modeling, we leverage the zero-shot generation capability of ControlNet to synthesize view-consistent texture maps in a single feed-forward inference step, thereby drastically decreasing the generation time as compared to existing methods. We demonstrate superior…
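
The text-to-latent alignment and latent interpolation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the dimensions (`CLIP_DIM`, `LATENT_DIM`), the two-layer mapping network, its untrained random weights, and the function names `clip_to_garment_latent` and `interpolate_latents` are all assumptions made for illustration, and random vectors stand in for real CLIP text embeddings.

```python
import numpy as np

CLIP_DIM = 512    # assumed size of a CLIP text embedding
LATENT_DIM = 32   # hypothetical size of the garment latent code

rng = np.random.default_rng(0)

# Hypothetical two-layer mapping network. The weights are random and
# untrained; a real system would learn them during alignment.
W1 = rng.normal(scale=0.02, size=(CLIP_DIM, 128))
b1 = np.zeros(128)
W2 = rng.normal(scale=0.02, size=(128, LATENT_DIM))
b2 = np.zeros(LATENT_DIM)


def clip_to_garment_latent(text_embedding):
    """Map a CLIP-style embedding to a garment latent code (illustrative)."""
    e = text_embedding / np.linalg.norm(text_embedding)  # unit-normalize
    hidden = np.maximum(e @ W1 + b1, 0.0)                # ReLU layer
    return hidden @ W2 + b2


def interpolate_latents(z_a, z_b, t):
    """Linearly blend two garment latents; t=0 gives z_a, t=1 gives z_b."""
    return (1.0 - t) * z_a + t * z_b


# Random stand-ins for the CLIP embeddings of two text prompts.
emb_a = rng.normal(size=CLIP_DIM)
emb_b = rng.normal(size=CLIP_DIM)

z_a = clip_to_garment_latent(emb_a)
z_b = clip_to_garment_latent(emb_b)
z_mid = interpolate_latents(z_a, z_b, 0.5)  # halfway between the two codes
```

In a setup like this, a decoder (not shown) would turn each latent code into a garment mesh, and moving `t` from 0 to 1 would morph one garment into the other. How smooth that morph is depends on the structure of the latent space, which is what the abstract's disentanglement loss is said to improve.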

Taxonomy

Topics: Image Processing and 3D Reconstruction · 3D Shape Modeling and Analysis · Human Motion and Animation

Methods: Contrastive Language-Image Pre-training · ALIGN