MoCapDeform: Monocular 3D Human Motion Capture in Deformable Scenes
Zhi Li, Soshi Shimada, Bernt Schiele, Christian Theobalt, and Vladislav Golyanik

TL;DR
MoCapDeform is a novel framework that improves monocular 3D human motion capture by explicitly modeling non-rigid scene deformations, leading to better pose estimation and environment reconstruction in complex, deformable scenes.
Contribution
It introduces the first method to jointly estimate 3D human poses and non-rigid scene deformations from monocular RGB videos.
Findings
Achieves higher 3D human pose estimation accuracy than existing methods.
Successfully models non-rigid scene deformations.
Performs well on new datasets with deformable backgrounds.
Abstract
3D human motion capture from monocular RGB images respecting interactions of a subject with complex and possibly deformable environments is a very challenging, ill-posed and under-explored problem. Existing methods address it only weakly and do not model possible surface deformations often occurring when humans interact with scene surfaces. In contrast, this paper proposes MoCapDeform, i.e., a new framework for monocular 3D human motion capture that is the first to explicitly model non-rigid deformations of a 3D scene for improved 3D human pose estimation and deformable environment reconstruction. MoCapDeform accepts a monocular RGB video and a 3D scene mesh aligned in the camera space. It first localises a subject in the input monocular video along with dense contact labels using a new raycasting based strategy. Next, our human-environment interaction constraints are leveraged to…
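As a rough illustration of the raycasting idea mentioned in the abstract, the sketch below casts a camera ray through a pixel and intersects it with a single triangle of a scene mesh expressed in camera space, using the standard Möller-Trumbore ray/triangle test. This is not the authors' contact-labelling procedure, whose details are not given here. The pinhole intrinsics (`fx`, `fy`, `cx`, `cy`), the function names, and the example triangle are all assumptions made for illustration.

```python
# Illustrative sketch only: generic ray/triangle intersection (Moller-Trumbore).
# It is NOT the paper's raycasting strategy; all names and values are hypothetical.

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def pixel_ray(u, v, fx, fy, cx, cy):
    """Direction of the camera ray through pixel (u, v), assuming a pinhole camera."""
    return ((u - cx) / fx, (v - cy) / fy, 1.0)

def ray_triangle_depth(origin, direction, v0, v1, v2, eps=1e-9):
    """Return the ray parameter t at the hit point, or None if the ray misses."""
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:            # ray is parallel to the triangle plane
        return None
    inv = 1.0 / det
    s = sub(origin, v0)
    u = dot(s, p) * inv           # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = dot(direction, q) * inv   # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv
    return t if t > eps else None  # keep only hits in front of the camera

# Hypothetical scene triangle two units in front of the camera (camera space).
TRIANGLE = ((-1.0, -1.0, 2.0), (1.0, -1.0, 2.0), (0.0, 1.0, 2.0))
CAMERA_ORIGIN = (0.0, 0.0, 0.0)

# Ray through the (assumed) principal point of a 640x480 image.
centre_ray = pixel_ray(320, 240, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
centre_hit = ray_triangle_depth(CAMERA_ORIGIN, centre_ray, *TRIANGLE)
```

Because the ray direction has unit z-component, the returned `t` equals the depth of the hit point along the camera axis. In a setting like the one the abstract describes, such a depth could in principle be compared with an estimated body-surface depth at the same pixel to decide whether the two are close enough to count as contact, but that comparison is a guess at how raycasting might be used, not a description of the paper's method.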
Taxonomy
Topics: Human Pose and Action Recognition · Advanced Vision and Imaging · Video Surveillance and Tracking Methods
