SAT: Supervisor Regularization and Animation Augmentation for Two-process Monocular Texture 3D Human Reconstruction

Gangjian Zhang; Jian Shu; Nanjie Yao; Hao Wang

arXiv:2508.19688 · cs.CV · August 28, 2025

TL;DR

This paper introduces SAT, a novel framework for monocular 3D human reconstruction that effectively integrates geometric priors and augments training data online, resulting in higher quality textured 3D avatars from single images.

Contribution

The paper proposes a two-process reconstruction framework with supervisor feature regularization and online animation augmentation to improve geometric learning and data diversity.

Findings

01. Outperforms state-of-the-art methods on benchmark datasets.

02. Effectively fuses multiple geometric priors for consistent 3D reconstruction.

03. Enhances training data with online animation augmentation for better model generalization.

Abstract

Monocular texture 3D human reconstruction aims to create a complete 3D digital avatar from just a single front-view human RGB image. However, the geometric ambiguity inherent in a single 2D image and the scarcity of 3D human training data are the main obstacles limiting progress in this field. To address these issues, current methods employ prior geometric estimation networks to derive various human geometric forms, such as the SMPL model and normal maps. However, they struggle to integrate these modalities effectively, leading to view inconsistencies, such as facial distortions. To this end, we propose a two-process 3D human reconstruction framework, SAT, which seamlessly learns various prior geometries in a unified manner and reconstructs high-quality textured 3D avatars as the final output. To further facilitate geometry learning, we introduce a Supervisor Feature Regularization…
