Temporal Consistency Loss for High Resolution Textured and Clothed 3D Human Reconstruction from Monocular Video
Akin Caliskan, Armin Mustafa, Adrian Hilton

TL;DR
This paper introduces a new method for temporally consistent 3D reconstruction of clothed humans from monocular videos, improving accuracy, texture quality, and temporal stability over previous approaches.
Contribution
It proposes a novel temporal consistency loss and hybrid representation learning to enhance 3D human reconstruction from monocular video.
Findings
Significantly outperforms state-of-the-art methods in accuracy and consistency.
Improves texture prediction quality in 3D reconstructions.
Achieves better temporal stability in reconstructed sequences.
Abstract
We present a novel method to learn temporally consistent 3D reconstruction of clothed people from a monocular video. Recent methods for 3D human reconstruction from monocular video, using volumetric, implicit or parametric human shape models, produce per-frame reconstructions, giving temporally inconsistent output and limited performance when applied to video. In this paper, we introduce an approach to learn temporally consistent features for textured reconstruction of clothed 3D human sequences from monocular video by proposing two advances: a novel temporal consistency loss function, and hybrid representation learning for implicit 3D reconstruction from 2D images and coarse 3D geometry. The proposed advances improve the temporal consistency and accuracy of both the 3D reconstruction and texture prediction from a monocular video. Comprehensive comparative performance evaluation on images…
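The excerpt above names a temporal consistency loss but does not define it. As a rough illustration of the general idea only, the sketch below penalizes the change in per-frame features between consecutive frames. The function name `temporal_consistency_loss`, the list-of-vectors input, and the squared-difference penalty are all assumptions made for this example; they are not taken from the paper, whose actual formulation may differ substantially.

```python
from typing import Sequence


def temporal_consistency_loss(features: Sequence[Sequence[float]]) -> float:
    """Mean squared difference between features of consecutive frames.

    Hypothetical illustration: features[t] is a feature vector for frame t.
    This is a generic smoothness penalty, not the loss defined in the paper.
    """
    if len(features) < 2:
        # A single frame has no temporal neighbour to compare against.
        return 0.0

    total = 0.0
    count = 0
    # Pair each frame with the one that follows it.
    for prev, curr in zip(features, features[1:]):
        if len(prev) != len(curr):
            raise ValueError("all frames must have the same feature length")
        for a, b in zip(prev, curr):
            total += (a - b) ** 2
            count += 1

    # Guard against empty feature vectors.
    return total / count if count else 0.0
```

A term like this would typically be added, with some weight, to a per-frame reconstruction loss, so that the network is discouraged from producing features that change sharply between neighbouring frames of the same subject.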
