GART: Gaussian Articulated Template Models
Jiahui Lei, Yufu Wang, Georgios Pavlakos, Lingjie Liu, and Kostas Daniilidis

TL;DR
GART is a novel model that uses a mixture of 3D Gaussians and a template prior to efficiently capture, reconstruct, and render non-rigid articulated subjects from monocular videos in real time.
Contribution
The paper introduces GART, a new explicit, expressive, and efficient representation that combines Gaussian mixtures with template priors for modeling non-rigid articulated objects.
Findings
Reconstructs non-rigid subjects from monocular videos in seconds to minutes.
Renders in novel poses at over 150 frames per second.
Generalizes to complex deformations with learnable latent bones.
Abstract
We introduce the Gaussian Articulated Template Model (GART), an explicit, efficient, and expressive representation for non-rigid articulated subject capturing and rendering from monocular videos. GART utilizes a mixture of moving 3D Gaussians to explicitly approximate a deformable subject's geometry and appearance. It takes advantage of a categorical template model prior (SMPL, SMAL, etc.) with learnable forward skinning while further generalizing to more complex non-rigid deformations with novel latent bones. GART can be reconstructed via differentiable rendering from monocular videos in seconds or minutes and rendered in novel poses faster than 150 fps.
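To make "forward skinning" of a Gaussian more concrete, here is a minimal sketch of linear blend skinning applied to a single Gaussian center. It is an illustration under stated assumptions, not the authors' implementation: the function names (`blend_transforms`, `skin_mean`, `rot_z`) are hypothetical, the bone transforms and skinning weights are hand-picked rather than learned or taken from SMPL/SMAL, and only the mean is moved (a full model would also transform each Gaussian's orientation and appearance).

```python
import math

def blend_transforms(weights, bone_transforms):
    """Weighted sum of per-bone 3x4 [R | t] matrices (linear blend skinning)."""
    blended = [[0.0] * 4 for _ in range(3)]
    for w, T in zip(weights, bone_transforms):
        for r in range(3):
            for c in range(4):
                blended[r][c] += w * T[r][c]
    return blended

def skin_mean(mean, weights, bone_transforms):
    """Move one canonical-space Gaussian center into the posed space."""
    A = blend_transforms(weights, bone_transforms)
    x, y, z = mean
    return [A[r][0] * x + A[r][1] * y + A[r][2] * z + A[r][3] for r in range(3)]

def rot_z(angle, translation=(0.0, 0.0, 0.0)):
    """Hypothetical helper: 3x4 rigid transform rotating about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    tx, ty, tz = translation
    return [[c, -s, 0.0, tx], [s, c, 0.0, ty], [0.0, 0.0, 1.0, tz]]

# Two illustrative bones: one at rest, one rotated by 90 degrees about z.
identity = rot_z(0.0)
quarter_turn = rot_z(math.pi / 2)

canonical_mean = [1.0, 0.0, 0.0]

# A Gaussian bound entirely to the rotated bone follows it rigidly.
rigid = skin_mean(canonical_mean, [0.0, 1.0], [identity, quarter_turn])

# A Gaussian with weights split across both bones lands in between.
mixed = skin_mean(canonical_mean, [0.5, 0.5], [identity, quarter_turn])
```

In this sketch, additional bones are simply extra entries in `weights` and `bone_transforms`. The latent bones described in the abstract could be read as such extra entries whose transforms are learned rather than supplied by the template, though that reading is an assumption here.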
Taxonomy
Topics: Human Pose and Action Recognition · Face recognition and analysis · Human Motion and Animation
