ShapeGaussian: High-Fidelity 4D Human Reconstruction in Monocular Videos via Vision Priors

Zhenxiao Liang; Ning Zhang; Youbao Tang; Ruei-Sung Lin; Qixing Huang; Peng Chang; Jing Xiao

arXiv:2602.05572 · cs.CV · February 6, 2026



TL;DR

ShapeGaussian is a novel method for high-fidelity 4D human reconstruction from monocular videos that combines vision priors with a two-step process to improve accuracy and robustness over existing template-based and template-free approaches.

Contribution

It introduces a template-free approach that integrates vision priors and a neural deformation model for superior 4D human reconstruction from monocular videos.

Findings

01 Outperforms template-based methods in accuracy and visual quality.

02 Effectively mitigates pose estimation errors using vision priors.

03 Handles diverse human motions robustly in casual videos.

Abstract

We introduce ShapeGaussian, a high-fidelity, template-free method for 4D human reconstruction from casual monocular videos. Generic reconstruction methods lacking robust vision priors, such as 4DGS, struggle to capture high-deformation human motion without multi-view cues. While template-based approaches, primarily relying on SMPL, such as HUGS, can produce photorealistic results, they are highly susceptible to errors in human pose estimation, often leading to unrealistic artifacts. In contrast, ShapeGaussian effectively integrates template-free vision priors to achieve both high-fidelity and robust scene reconstructions. Our method follows a two-step pipeline: first, we learn a coarse, deformable geometry using pretrained models that estimate data-driven priors, providing a foundation for reconstruction. Then, we refine this geometry using a neural deformation model to capture…
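The coarse-then-refine structure described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`coarse_stage`, `refine_stage`, `mean_sq_error`), the synthetic 3D targets, and the use of plain gradient descent on per-point residuals are all assumptions made for illustration. The actual method is supervised from monocular video and uses pretrained vision models and a neural deformation model, none of which is reproduced here.

```python
import random

def coarse_stage(canonical, frame_offsets):
    """Step 1 (toy): shift every canonical point by a per-frame offset from a prior."""
    return [[tuple(p[k] + off[k] for k in range(3)) for p in canonical]
            for off in frame_offsets]

def refine_stage(coarse, target, steps=100, lr=0.1):
    """Step 2 (toy): gradient descent on per-point residuals under squared error."""
    residual = [[[0.0] * 3 for _ in frame] for frame in coarse]
    for _ in range(steps):
        for t, frame in enumerate(coarse):
            for i, point in enumerate(frame):
                for k in range(3):
                    err = point[k] + residual[t][i][k] - target[t][i][k]
                    residual[t][i][k] -= lr * 2.0 * err  # gradient of err**2
    return [[tuple(p[k] + residual[t][i][k] for k in range(3))
             for i, p in enumerate(frame)]
            for t, frame in enumerate(coarse)]

def mean_sq_error(a, b):
    """Mean squared coordinate difference between two (frames x points x 3) sets."""
    diffs = [(pa[k] - pb[k]) ** 2
             for fa, fb in zip(a, b) for pa, pb in zip(fa, fb) for k in range(3)]
    return sum(diffs) / len(diffs)

random.seed(0)
# Hypothetical canonical shape: 20 points; hypothetical motion: 4 frames.
canonical = [tuple(random.gauss(0, 1) for _ in range(3)) for _ in range(20)]
true_offsets = [tuple(random.gauss(0, 1) for _ in range(3)) for _ in range(4)]
# Synthetic "observations": rigid motion plus small per-point non-rigid noise.
target = [[tuple(p[k] + off[k] + random.gauss(0, 0.05) for k in range(3))
           for p in canonical] for off in true_offsets]
# The prior is deliberately imperfect, standing in for a noisy pretrained estimate.
noisy_prior = [tuple(o[k] + random.gauss(0, 0.2) for k in range(3))
               for o in true_offsets]

coarse = coarse_stage(canonical, noisy_prior)
refined = refine_stage(coarse, target)
coarse_err = mean_sq_error(coarse, target)
refined_err = mean_sq_error(refined, target)
print(f"coarse error: {coarse_err:.4f}, refined error: {refined_err:.2e}")
```

In this toy setting the refinement trivially drives the error toward zero because it fits the 3D targets directly. The only point is the division of labor: a prior supplies an approximate deformation, and a second stage corrects what the prior gets wrong.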


Taxonomy

Topics: Human Pose and Action Recognition · Robot Manipulation and Learning · 3D Shape Modeling and Analysis