RemoCap: Disentangled Representation Learning for Motion Capture

Hongsheng Wang; Lizao Zhang; Zhangnan Zhong; Shuolin Xu; Xinrui Zhou; Shengyu Zhang; Huahao Xu; Fei Wu; Feng Lin

arXiv:2405.12724·cs.CV·May 22, 2024

Open Access

TL;DR

RemoCap introduces a novel disentangled representation learning approach for 3D human body reconstruction from motion sequences, effectively handling occlusions and improving accuracy through spatial and motion disentanglement techniques.

Contribution

It proposes Spatial Disentanglement and Motion Disentanglement modules, along with a sequence velocity loss, to enhance occlusion handling and temporal coherence in 3D human reconstruction.
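
The source does not give the formula for the sequence velocity loss, so the following is only a minimal sketch of one common way to write such a term: compare frame-to-frame joint displacements of the prediction against those of the ground truth. The function name `sequence_velocity_loss`, the frames-by-joints-by-xyz input layout, and the choice of a mean L2 distance are all assumptions made for illustration, not the authors' definition.

```python
import math


def sequence_velocity_loss(pred, target):
    """Mean L2 error between frame-to-frame joint velocities.

    pred, target: sequences of frames; each frame is a list of (x, y, z) joints.
    Hypothetical formulation for illustration, not taken from the paper.
    """
    if len(pred) != len(target):
        raise ValueError("sequences must have the same number of frames")
    if len(pred) < 2:
        raise ValueError("need at least two frames to form a velocity")

    total, count = 0.0, 0
    for t in range(1, len(pred)):
        for j in range(len(pred[t])):
            # Velocity = displacement of joint j between consecutive frames.
            pred_vel = [pred[t][j][k] - pred[t - 1][j][k] for k in range(3)]
            target_vel = [target[t][j][k] - target[t - 1][j][k] for k in range(3)]
            total += math.dist(pred_vel, target_vel)
            count += 1
    return total / count


# A static two-joint target over three frames.
static = [[(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)] for _ in range(3)]
# Same motion, constant offset: velocities match, so the loss is 0.0.
shifted = [[(x + 5.0, y, z) for (x, y, z) in frame] for frame in static]
# Prediction drifts one unit along x per frame: the loss is 1.0.
drifting = [[(float(t), 0.0, 0.0), (1.0 + t, 1.0, 1.0)] for t in range(3)]

print(sequence_velocity_loss(shifted, static))   # 0.0
print(sequence_velocity_loss(drifting, static))  # 1.0
```

A term of this shape ignores constant positional offsets and penalizes only mismatched motion, which is why it is usually paired with a per-frame position loss rather than used alone.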

Findings

1. Outperforms state-of-the-art methods on benchmark datasets.

2. Achieves best MPVPE, MPJPE, and PA-MPJPE metrics on 3DPW.

3. Demonstrates superior occlusion handling and motion fidelity.
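
For readers unfamiliar with the metrics named in the findings, MPJPE (mean per-joint position error) is the average Euclidean distance between predicted and ground-truth joint positions. The sketch below shows that computation for a single pose; the function name `mpjpe` and the list-of-tuples input are illustrative assumptions. PA-MPJPE additionally applies a rigid Procrustes alignment before measuring the error, which is not shown here.

```python
import math


def mpjpe(pred_joints, gt_joints):
    """Mean Euclidean distance between matching joints of one pose.

    pred_joints, gt_joints: equal-length lists of (x, y, z) tuples.
    The result is in the same units as the input coordinates.
    """
    if not pred_joints or len(pred_joints) != len(gt_joints):
        raise ValueError("poses must be non-empty and have the same joint count")
    distances = [math.dist(p, g) for p, g in zip(pred_joints, gt_joints)]
    return sum(distances) / len(distances)


pred = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
gt = [(3.0, 4.0, 0.0), (0.0, 0.0, 0.0)]
print(mpjpe(pred, gt))  # 2.5: one joint is off by 5 units, the other by 0
```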

Abstract

Reconstructing 3D human bodies from realistic motion sequences remains a challenge due to pervasive and complex occlusions. Current methods struggle to capture the dynamics of occluded body parts, leading to model penetration and distorted motion. RemoCap leverages Spatial Disentanglement (SD) and Motion Disentanglement (MD) to overcome these limitations. SD addresses occlusion interference between the target human body and surrounding objects. It achieves this by disentangling target features along the dimension axis. By aligning features based on their spatial positions in each dimension, SD isolates the target object's response within a global window, enabling accurate capture despite occlusions. The MD module employs a channel-wise temporal shuffling strategy to simulate diverse scene dynamics. This process effectively disentangles motion features, allowing RemoCap to reconstruct…
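
The abstract mentions a channel-wise temporal shuffling strategy without specifying it. As a rough illustration only, the sketch below permutes the time order of a randomly chosen subset of feature channels while leaving the other channels untouched. The function name `channelwise_temporal_shuffle`, the `shuffle_ratio` and `seed` parameters, and the frames-by-channels layout are hypothetical choices, not details from the paper.

```python
import random


def channelwise_temporal_shuffle(features, shuffle_ratio=0.5, seed=None):
    """Permute the time order of a random subset of channels.

    features: list of T frames, each a list of C channel values.
    Returns a new T x C list; the input is not modified.
    Hypothetical sketch, not the authors' MD module.
    """
    rng = random.Random(seed)
    num_frames = len(features)
    num_channels = len(features[0])
    num_shuffled = int(num_channels * shuffle_ratio)
    chosen = rng.sample(range(num_channels), num_shuffled)

    out = [list(frame) for frame in features]
    for c in chosen:
        # Draw an independent temporal permutation for this channel.
        order = list(range(num_frames))
        rng.shuffle(order)
        for t, src in enumerate(order):
            out[t][c] = features[src][c]
    return out


# Five frames, four channels; value 10 * t + c makes the origin of each entry visible.
features = [[10 * t + c for c in range(4)] for t in range(5)]
shuffled = channelwise_temporal_shuffle(features, shuffle_ratio=0.5, seed=0)
for frame in shuffled:
    print(frame)
```

Each channel keeps exactly the same set of values over the sequence; only the temporal order changes for the selected channels, so per-channel statistics are preserved while temporal correlations across channels are disturbed.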

Taxonomy

Topics: Human Pose and Action Recognition · Advanced Vision and Imaging · Generative Adversarial Networks and Image Synthesis