PSHuman: Photorealistic Single-image 3D Human Reconstruction using Cross-Scale Multiview Diffusion and Explicit Remeshing

Peng Li; Wangguandong Zheng; Yuan Liu; Tao Yu; Yangguang Li; Xingqun Qi; Xiaowei Chi; Siyu Xia; Yan-Pei Cao; Wei Xue; Wenhan Luo; Yike Guo

arXiv:2409.10141·cs.CV·March 25, 2025·2 cites

TL;DR

PSHuman introduces a novel framework for photorealistic 3D human reconstruction from a single image, combining multiview diffusion, cross-scale modeling, and explicit remeshing to produce detailed, consistent, and realistic human meshes.

Contribution

The paper proposes a cross-scale diffusion approach conditioned on SMPL-X priors for improved single-view 3D human reconstruction, addressing geometric distortions and pose inconsistencies.

Findings

01 Outperforms existing methods in geometry detail and texture fidelity.

02 Achieves high-quality 3D human meshes with strong generalization.

03 Demonstrates effectiveness on CAPE and THuman2.1 datasets.

Abstract

Detailed and photorealistic 3D human modeling is essential for various applications and has seen tremendous progress. However, full-body reconstruction from a monocular RGB image remains challenging due to the ill-posed nature of the problem and the complex clothing topology with self-occlusions. In this paper, we propose PSHuman, a novel framework that explicitly reconstructs human meshes using priors from a multiview diffusion model. We find that directly applying multiview diffusion to single-view human images leads to severe geometric distortions, especially on generated faces. To address this, we propose a cross-scale diffusion that models the joint probability distribution of global full-body shape and local facial characteristics, enabling detailed and identity-preserved novel-view generation without any geometric distortion. Moreover, to enhance cross-view body shape…
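
The joint modeling of body and face mentioned in the abstract can be read schematically as follows. This is only a sketch: the symbols $x^{B}$ (generated full-body views), $x^{F}$ (generated face views), $c$ (the conditioning, i.e. the input image and, per the contribution summary, SMPL-X priors), and $\epsilon_\theta$ (the denoising network) are our own notation and are not taken from the paper. The loss shown is the generic noise-prediction objective of diffusion models, not necessarily the exact objective the authors use.

```latex
% Assumed notation (not from the paper):
%   x^B : full-body novel views,  x^F : local face views,
%   c   : conditioning (input image, SMPL-X priors),
%   x_t : noised sample at diffusion step t.

% Independent generation would factorize the two scales:
%   p(x^B \mid c)\, p(x^F \mid c)
% A cross-scale model instead targets the joint distribution:
\[
  p_\theta\!\left(x^{B}, x^{F} \mid c\right)
\]

% Generic noise-prediction training loss on the concatenated
% body/face sample, with \epsilon = (\epsilon^{B}, \epsilon^{F}):
\[
  \mathcal{L}(\theta)
  = \mathbb{E}_{x_0,\; \epsilon \sim \mathcal{N}(0, I),\; t}
    \left\|\, \epsilon
      - \epsilon_\theta\!\left(x^{B}_{t},\, x^{F}_{t},\, t,\, c\right)
    \right\|_2^2
\]
```

Under this reading, the face branch is not denoised in isolation: the network sees both scales at every step, which is one plausible way that facial detail could stay consistent with the global body shape.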

Taxonomy

Topics: Advanced Vision and Imaging · 3D Shape Modeling and Analysis · Generative Adversarial Networks and Image Synthesis

Methods: Diffusion