TL;DR
This paper introduces a real-time, feed-forward Gaussian splatting framework for human 3D reconstruction and animation from multi-view images, enabling efficient and interactive human avatar creation.
Contribution
The method predicts, in a canonical pose, Gaussian primitives attached to SMPL-X vertices, so new poses can be animated in real time without re-running the network.
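As a rough illustration of why vertex-attached Gaussians can be animated without further network inference, the sketch below re-poses canonical Gaussians with per-vertex transforms. It assumes each SMPL-X vertex already has a rigid canonical-to-posed 4x4 transform (for example, obtained from the body model's skinning); the summary does not say how the paper computes these. The function name `pose_gaussians` and all array layouts are hypothetical.

```python
import numpy as np

def pose_gaussians(means_c, covs_c, vertex_ids, vertex_transforms):
    """Move canonical Gaussians to a target pose (illustrative sketch only).

    means_c:           (N, 3)    canonical Gaussian centers
    covs_c:            (N, 3, 3) canonical covariance matrices
    vertex_ids:        (N,)      index of the vertex each Gaussian is attached to
    vertex_transforms: (V, 4, 4) assumed rigid canonical-to-posed transform per vertex
    """
    T = vertex_transforms[vertex_ids]        # (N, 4, 4): one transform per Gaussian
    R, t = T[:, :3, :3], T[:, :3, 3]         # rotation and translation parts
    means_p = np.einsum("nij,nj->ni", R, means_c) + t
    # A covariance transforms as R @ Sigma @ R^T under a rigid motion.
    covs_p = R @ covs_c @ np.transpose(R, (0, 2, 1))
    return means_p, covs_p
```

Under these assumptions, the network output is computed once and only this cheap transform runs per frame, which is consistent with the real-time animation claim.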
Findings
Achieves reconstruction quality comparable to state-of-the-art methods.
Supports real-time animation and interactive applications.
Operates efficiently with a single forward pass.
Abstract
We present a generalizable feed-forward Gaussian splatting framework for human 3D reconstruction and real-time animation that operates directly on multi-view RGB images and their associated SMPL-X poses. Unlike prior methods that rely on depth supervision, fixed input views, UV maps, or repeated feed-forward inference for each target view or pose, our approach predicts, in a canonical pose, a set of 3D Gaussian primitives associated with each SMPL-X vertex. One Gaussian is regularized to remain close to the SMPL-X surface, providing a strong geometric prior and stable correspondence to the parametric body model, while an additional small set of unconstrained Gaussians per vertex allows the representation to capture geometric structures that deviate from the parametric surface, such as clothing and hair. In contrast to recent approaches such as HumanRAM, which require repeated network…
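The per-vertex layout described in the abstract (one Gaussian kept near the SMPL-X surface, plus a few unconstrained ones) can be sketched as follows. This is a minimal illustration, not the paper's loss: the abstract does not give the form of the regularizer, so a plain squared-distance penalty is assumed here, and the convention that slot 0 holds the surface-anchored Gaussian, as well as the names `gaussian_centers` and `surface_anchor_loss`, are hypothetical.

```python
import numpy as np

def gaussian_centers(vertices, offsets):
    """Canonical centers of K Gaussians per vertex.

    vertices: (V, 3)    canonical SMPL-X vertex positions
    offsets:  (V, K, 3) predicted displacement of each Gaussian from its vertex
    """
    return vertices[:, None, :] + offsets    # (V, K, 3)

def surface_anchor_loss(offsets):
    """Mean squared distance of the anchored Gaussian from its vertex.

    Assumption: slot 0 along the K axis is the surface-anchored Gaussian;
    slots 1..K-1 are left unconstrained so they can model clothing or hair.
    """
    anchored = offsets[:, 0, :]              # (V, 3)
    return float(np.mean(np.sum(anchored ** 2, axis=-1)))
```

Only the anchored slot contributes to the penalty, so the remaining Gaussians are free to move away from the body surface.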
