TeCH: Text-guided Reconstruction of Lifelike Clothed Humans
Yangyi Huang, Hongwei Yi, Yuliang Xiu, Tingting Liao, Jiaxiang Tang, Deng Cai, Justus Thies

TL;DR
TeCH introduces a novel method for reconstructing detailed 3D clothed humans from a single image by leveraging text prompts, personalized diffusion models, and hybrid 3D representations, achieving high fidelity and accuracy.
Contribution
TeCH proposes an approach that combines descriptive text prompts, a personalized diffusion model, and a hybrid 3D representation to reconstruct a detailed clothed human from a single image.
Findings
Outperforms state-of-the-art methods in accuracy and quality
Produces high-fidelity, detailed 3D clothed humans
Effectively captures unseen regions with high-level details
Abstract
Despite recent research advancements in reconstructing clothed humans from a single image, accurately restoring the "unseen regions" with high-level details remains an unsolved challenge that lacks attention. Existing methods often generate overly smooth back-side surfaces with a blurry texture. But how to effectively capture all visual attributes of an individual from a single image, which are sufficient to reconstruct unseen areas (e.g., the back view)? Motivated by the power of foundation models, TeCH reconstructs the 3D human by leveraging 1) descriptive text prompts (e.g., garments, colors, hairstyles) which are automatically generated via a garment parsing model and Visual Question Answering (VQA), 2) a personalized fine-tuned Text-to-Image diffusion model (T2I) which learns the "indescribable" appearance. To represent high-resolution 3D clothed humans at an affordable cost, we…
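As a rough illustration of the first ingredient in the abstract, the sketch below assembles a descriptive text prompt from per-attribute answers such as garments, colors, and hairstyles. It is not the authors' code: the function name `build_prompt`, the attribute keys, and the placeholder identity token `sks` are all hypothetical, and the answers are hard-coded here rather than produced by a garment parsing model or a VQA model.

```python
from typing import Dict


def build_prompt(answers: Dict[str, str], identity_token: str = "sks") -> str:
    """Join attribute descriptions into one comma-separated prompt.

    `answers` maps an attribute name (hypothetical keys such as
    "hairstyle") to a short description. Empty or whitespace-only
    descriptions are skipped. `identity_token` is an assumed placeholder
    for the subject a personalized text-to-image model would learn.
    """
    phrases = [text.strip() for text in answers.values() if text and text.strip()]
    return ", ".join([f"a {identity_token} person"] + phrases)


# Hard-coded stand-ins for what a parsing/VQA stage might return.
example_answers = {
    "hairstyle": "short black hair",
    "upper garment": "blue denim jacket",
    "lower garment": "",  # unanswered attribute, skipped
}
prompt = build_prompt(example_answers)
```

The design choice here is to keep the prompt a flat list of short phrases, so that attributes that could not be determined simply drop out instead of leaving gaps in the text.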
Taxonomy
Topics: 3D Shape Modeling and Analysis · Human Pose and Action Recognition · Generative Adversarial Networks and Image Synthesis
Methods: Diffusion
