Better Rigs, Not Bigger Networks: A Body Model Ablation for Gaussian Avatars
Derek Austin

TL;DR
Replacing SMPL with the Momentum Human Rig (MHR), estimated via SAM-3D-Body, lets a minimal 3D Gaussian avatar pipeline reach high visual fidelity without the architectural complexity of recent methods.
Contribution
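Demonstrates that a minimal pipeline using MHR and SAM-3D-Body, with no learned deformations or pose-dependent corrections, matches or surpasses more complex SMPL-based methods in avatar reconstruction, and uses controlled ablations to show that body model expressiveness is a primary bottleneck.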
Findings
Minimal pipeline achieves the highest reported PSNR, with competitive or superior LPIPS and SSIM, on PeopleSnapshot and ZJU-MoCap.
Body model expressiveness significantly impacts avatar reconstruction quality.
A simpler pipeline built on a more expressive body model can outperform more complex architectures in 3D avatar tasks.
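For readers unfamiliar with the headline metric, PSNR compares a rendered image against a ground-truth image through their mean squared error. The following is a minimal sketch in Python with NumPy, assuming images are float arrays scaled to [0, 1]; the function name `psnr` and the `max_value` parameter are illustrative choices, and the paper's actual evaluation protocol (for example any masking or cropping) is not described here.

```python
import numpy as np


def psnr(rendered, reference, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images.

    Hypothetical helper for illustration only; `max_value` is the largest
    possible pixel value (1.0 for images normalized to [0, 1]).
    """
    rendered = np.asarray(rendered, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    if rendered.shape != reference.shape:
        raise ValueError("rendered and reference must have the same shape")

    # Mean squared error over all pixels and channels.
    mse = np.mean((rendered - reference) ** 2)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / mse))
```

Higher values indicate a closer match to the reference image, which is why the finding above reports the highest PSNR as the best result.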
Abstract
Recent 3D Gaussian splatting methods built atop SMPL achieve remarkable visual fidelity while continually increasing the complexity of the overall training architecture. We demonstrate that much of this complexity is unnecessary: by replacing SMPL with the Momentum Human Rig (MHR), estimated via SAM-3D-Body, a minimal pipeline with no learned deformations or pose-dependent corrections achieves the highest reported PSNR and competitive or superior LPIPS and SSIM on PeopleSnapshot and ZJU-MoCap. To disentangle pose estimation quality from body model representational capacity, we perform two controlled ablations: translating SAM-3D-Body meshes to SMPL-X, and translating the original dataset's SMPL poses into MHR, both retrained under identical conditions. These ablations confirm that body model expressiveness has been a primary bottleneck in avatar reconstruction, with both mesh…
