Gaussian Head & Shoulders: High Fidelity Neural Upper Body Avatars with Anchor Gaussian Guided Texture Warping
Tianhao Wu, Jing Yang, Zhilin Guo, Jingyi Wan, Fangcheng Zhong, Cengiz Oztireli

TL;DR
This paper introduces Gaussian Head & Shoulders, a neural avatar method that combines 3D Gaussian splatting with neural textures and anchor-Gaussian-guided warping to reconstruct high-fidelity upper-body avatars (head, chest, and shoulders) from casual videos at real-time speeds.
Contribution
It integrates Gaussian splatting with a neural texture for the body part, warped under the guidance of anchor Gaussians, enabling detailed and robust upper-body avatar reconstruction.
Findings
Achieves high-fidelity head and upper body reconstruction.
Operates at around 130 FPS without MLP queries.
Outperforms existing methods in quality and robustness.
Abstract
By equipping the most recent 3D Gaussian Splatting representation with head 3D morphable models (3DMM), existing methods manage to create head avatars with high fidelity. However, most existing methods only reconstruct a head without the body, substantially limiting their application scenarios. We found that naively applying Gaussians to model the clothed chest and shoulders tends to result in blurry reconstruction and noisy floaters under novel poses. This is because of the fundamental limitation of Gaussians and point clouds -- each Gaussian or point can only have a single directional radiance without spatial variance, therefore an unnecessarily large number of them is required to represent complicated spatially varying texture, even for simple geometry. In contrast, we propose to model the body part with a neural texture that consists of coarse and pose-dependent fine colors. To…
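The limitation described in the abstract can be illustrated with a toy sketch. It is not the paper's method and does not implement Gaussian splatting: it assumes a flat square surface carrying an 8 × 8 checkerboard, stands in for Gaussians with square patches that each hold one flat color, and compares them with a single quad that looks its color up in a texture. The function names (`checkerboard`, `constant_color_fit`, `texture_lookup`) are hypothetical and chosen only for this example.

```python
import numpy as np

def checkerboard(cells, res):
    """Ground-truth pattern: cells x cells checkerboard sampled at res x res pixels."""
    idx = (np.arange(res) * cells) // res          # checker cell index of each pixel
    return ((idx[:, None] + idx[None, :]) % 2).astype(np.float64)

def constant_color_fit(target, n):
    """Approximate target with an n x n grid of primitives, one flat color each.

    Assumes the image resolution is divisible by n.
    """
    res = target.shape[0]
    block = res // n
    # The least-squares flat color of a primitive is the mean over its footprint.
    means = target.reshape(n, block, n, block).mean(axis=(1, 3))
    return np.repeat(np.repeat(means, block, axis=0), block, axis=1)

def texture_lookup(texture, res):
    """Render one textured quad with a nearest-neighbour UV lookup."""
    t = texture.shape[0]
    uv = (np.arange(res) * t) // res               # texel index of each pixel
    return texture[uv[:, None], uv[None, :]]

target = checkerboard(cells=8, res=64)

# Mean squared error when the plane is covered by n x n flat-color primitives.
errors = {
    n: float(np.mean((constant_color_fit(target, n) - target) ** 2))
    for n in (1, 2, 4, 8, 16)
}

# Mean squared error for a single quad that samples an 8 x 8 texture.
texture_error = float(np.mean((texture_lookup(checkerboard(8, 8), 64) - target) ** 2))

for n, err in errors.items():
    print(f"{n * n:4d} flat-color primitives: MSE = {err:.3f}")
print(f"   1 textured quad:         MSE = {texture_error:.3f}")
```

In this toy setting the flat-color fit stays at the error of a uniform grey until the grid reaches 8 × 8 (64 primitives), while one textured quad reproduces the pattern exactly. The geometry is a single plane in both cases, so the number of primitives is driven by the texture alone.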
Taxonomy
Topics: Human Pose and Action Recognition · Human Motion and Animation · 3D Shape Modeling and Analysis
Methods: Sparse Evolutionary Training · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
