GeneMAN: Generalizable Single-Image 3D Human Reconstruction from Multi-Source Human Data

Wentao Wang; Hang Ye; Fangzhou Hong; Xue Yang; Jianfu Zhang; Yizhou Wang; Ziwei Liu; Liang Pan

arXiv:2411.18624 · cs.CV · January 27, 2026

TL;DR

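GeneMAN is a framework that reconstructs high-fidelity 3D human models from a single in-the-wild photo. It builds on diffusion models trained on a multi-source collection of human data (3D scans, multi-view videos, single photos, and synthetic humans), and it targets three difficulties: varying body proportions, personal belongings in the shot, and ambiguous poses with inconsistent textures.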

Contribution

The paper introduces a multi-source collection of human data and a diffusion-based approach to generalizable single-image 3D human reconstruction that does not rely on parametric human models such as SMPL.

Findings

01. Outperforms prior methods in 3D reconstruction quality.

02. Demonstrates strong generalization to in-the-wild images.

03. Produces high-quality textures and geometry in diverse scenarios.

Abstract

Given a single in-the-wild human photo, it remains a challenging task to reconstruct a high-fidelity 3D human model. Existing methods face difficulties including a) the varying body proportions captured by in-the-wild human images; b) diverse personal belongings within the shot; and c) ambiguities in human postures and inconsistency in human textures. In addition, the scarcity of high-quality human data intensifies the challenge. To address these problems, we propose a Generalizable image-to-3D huMAN reconstruction framework, dubbed GeneMAN, building upon a comprehensive multi-source collection of high-quality human data, including 3D scans, multi-view videos, single photos, and our generated synthetic human data. GeneMAN encompasses three key modules. 1) Without relying on parametric human models (e.g., SMPL), GeneMAN first trains a human-specific text-to-image diffusion model and a…

Code & Models

Datasets

wwt117/GeneMAN (dataset, 36 downloads)

Taxonomy

Topics: 3D Shape Modeling and Analysis · Human Pose and Action Recognition · Advanced Vision and Imaging

Methods: Diffusion