SAT: Supervisor Regularization and Animation Augmentation for Two-process Monocular Texture 3D Human Reconstruction
Gangjian Zhang, Jian Shu, Nanjie Yao, Hao Wang

TL;DR
This paper introduces SAT, a framework for monocular 3D human reconstruction that integrates multiple geometric priors in a unified manner and augments training data online, producing higher-quality textured 3D avatars from single images.
Contribution
The paper proposes a two-process reconstruction framework that uses Supervisor Feature Regularization to improve geometric learning and online animation augmentation to increase training-data diversity.
Findings
Outperforms state-of-the-art methods on benchmark datasets.
Effectively fuses multiple geometric priors for consistent 3D reconstruction.
Enhances training data with online animation augmentation for better model generalization.
Abstract
Monocular texture 3D human reconstruction aims to create a complete 3D digital avatar from just a single front-view human RGB image. However, the geometric ambiguity inherent in a single 2D image and the scarcity of 3D human training data are the main obstacles limiting progress in this field. To address these issues, current methods employ prior geometric estimation networks to derive various human geometric forms, such as the SMPL model and normal maps. However, they struggle to integrate these modalities effectively, leading to view inconsistencies, such as facial distortions. To this end, we propose a two-process 3D human reconstruction framework, SAT, which seamlessly learns various prior geometries in a unified manner and reconstructs high-quality textured 3D avatars as the final output. To further facilitate geometry learning, we introduce a Supervisor Feature Regularization…
