SCULPT: Shape-Conditioned Unpaired Learning of Pose-dependent Clothed and Textured Human Meshes

Soubhik Sanyal; Partha Ghosh; Jinlong Yang; Michael J. Black; Justus Thies; Timo Bolkart

arXiv:2308.10638·cs.CV·May 7, 2024

Open Access

TL;DR

SCULPT introduces a novel unpaired learning framework for generating detailed, pose-dependent 3D human meshes with clothing and textures, leveraging both 3D scans and 2D images to overcome data limitations.

Contribution

The paper proposes an unpaired learning approach that combines 3D scan data and 2D images to generate pose-dependent clothed human meshes with textures, addressing data scarcity issues.

Findings

1. Effective learning from limited 3D scans and large 2D image datasets.
2. Achieves realistic, pose-dependent human mesh generation with clothing and textures.
3. Outperforms existing 3D human body generative models.

Abstract

We present SCULPT, a novel 3D generative model for clothed and textured 3D meshes of humans. Specifically, we devise a deep neural network that learns to represent the geometry and appearance distribution of clothed human bodies. Training such a model is challenging, as datasets of textured 3D meshes for humans are limited in size and accessibility. Our key observation is that there exist medium-sized 3D scan datasets like CAPE, as well as large-scale 2D image datasets of clothed humans, and that multiple appearances can be mapped to a single geometry. To effectively learn from the two data modalities, we propose an unpaired learning procedure for pose-dependent clothed and textured human meshes. Specifically, we learn a pose-dependent geometry space from 3D scan data. We represent this as per-vertex displacements w.r.t. the SMPL model. Next, we train a geometry-conditioned texture generator…
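To make the phrase "per-vertex displacements w.r.t. the SMPL model" concrete, here is a minimal sketch of the general idea: clothing geometry stored as one 3D offset per vertex of a body template. It is not the authors' implementation. The function name `apply_displacements` and the three-vertex toy mesh are hypothetical, and the abstract does not state in which pose or coordinate frame the offsets are applied, so the sketch simply adds them in the template's own frame.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def apply_displacements(template: List[Vec3], displacements: List[Vec3]) -> List[Vec3]:
    """Add one 3D offset to each template vertex (hypothetical helper).

    Both lists must share the same vertex ordering, so the mesh faces
    (connectivity) of the template can be reused unchanged.
    """
    if len(template) != len(displacements):
        raise ValueError("one displacement per template vertex is required")
    return [
        (vx + dx, vy + dy, vz + dz)
        for (vx, vy, vz), (dx, dy, dz) in zip(template, displacements)
    ]


# Toy stand-in for a body template; a real SMPL mesh has 6890 vertices.
body = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
# Illustrative offsets; in a generative model these would be predicted.
offsets = [(0.0, 0.0, 0.5), (0.0, 0.0, 0.0), (0.25, 0.0, 0.0)]

clothed = apply_displacements(body, offsets)
```

Because the displaced mesh keeps the template's vertex count and ordering, any per-vertex quantity defined on the template (for example a UV texture layout) still lines up with the clothed result, which is one reason a displacement representation is convenient.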

Taxonomy

Topics: 3D Shape Modeling and Analysis · Generative Adversarial Networks and Image Synthesis · Human Pose and Action Recognition

Methods: BLIP (Bootstrapping Language-Image Pre-training) · CLIP (Contrastive Language-Image Pre-training)