TL;DR
This paper introduces a novel dynamic surface function network that enables temporally coherent reconstruction and animation of clothed human bodies from monocular RGB-D sequences, leveraging self-supervised learning and differentiable rasterization.
Contribution
It proposes a person-specific, pose-conditioned surface model using a multi-layer perceptron embedded in SMPL space, learned in a self-supervised manner for coherent 3D reconstruction and animation.
Findings
Achieves temporally coherent mesh sequences from monocular RGB-D data.
Enables synthesis of new animations with pose-dependent deformations.
Uses self-supervised learning with differentiable rasterization for surface modeling.
Abstract
We present a novel method for temporally coherent reconstruction and tracking of clothed humans. Given a monocular RGB-D sequence, we learn a person-specific body model which is based on a dynamic surface function network. To this end, we explicitly model the surface of the person using a multi-layer perceptron (MLP) which is embedded into the canonical space of the SMPL body model. With classical forward rendering, the represented surface can be rasterized using the topology of a template mesh. For each surface point of the template mesh, the MLP is evaluated to predict the actual surface location. To handle pose-dependent deformations, the MLP is conditioned on the SMPL pose parameters. We show that this surface representation as well as the pose parameters can be learned in a self-supervised fashion using the principle of analysis-by-synthesis and differentiable rasterization. As a…
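The pose-conditioned surface function described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the layer sizes, the random initialization, the choice to predict an offset added to the canonical point (rather than an absolute position), and the names `init_mlp` and `surface_mlp` are all assumptions made for illustration. The only figure taken from SMPL itself is the 72-dimensional pose vector (24 joints, 3 axis-angle values each). Skinning to posed space and the differentiable rasterization step are omitted.

```python
import numpy as np


def init_mlp(rng, sizes, scale=0.1):
    """Create random weights for a small fully connected network (hypothetical helper)."""
    return [
        (rng.normal(0.0, scale, size=(n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]


def surface_mlp(params, canonical_points, pose):
    """Evaluate the surface function for every template point.

    canonical_points: (N, 3) points on the template mesh in canonical space.
    pose:             (P,) pose vector used as the conditioning input.
    Returns:          (N, 3) predicted surface locations in canonical space.
    """
    n_points = canonical_points.shape[0]
    # Condition on pose by appending the same pose vector to every point.
    pose_tiled = np.broadcast_to(pose, (n_points, pose.shape[0]))
    x = np.concatenate([canonical_points, pose_tiled], axis=1)
    for i, (w, b) in enumerate(params):
        x = x @ w + b
        if i < len(params) - 1:
            x = np.maximum(x, 0.0)  # ReLU on hidden layers only
    # Assumption: the network output is treated as a displacement of the template point.
    return canonical_points + x


rng = np.random.default_rng(0)
POSE_DIM = 72          # SMPL pose: 24 joints x 3 axis-angle parameters
NUM_POINTS = 100       # stand-in for the template mesh vertices

params = init_mlp(rng, [3 + POSE_DIM, 64, 32, 3])
template = rng.uniform(-1.0, 1.0, size=(NUM_POINTS, 3))

rest_pose = np.zeros(POSE_DIM)
bent_pose = rng.normal(0.0, 0.2, size=POSE_DIM)

surface_rest = surface_mlp(params, template, rest_pose)
surface_bent = surface_mlp(params, template, bent_pose)
```

Because the pose vector is part of the network input, the same template point maps to different surface locations under different poses, which is how a model of this kind can represent pose-dependent deformations. In the self-supervised setting the abstract describes, the weights and the pose parameters would be optimized by comparing renderings of the predicted surface against the RGB-D input.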
