RemoCap: Disentangled Representation Learning for Motion Capture
Hongsheng Wang, Lizao Zhang, Zhangnan Zhong, Shuolin Xu, Xinrui Zhou, Shengyu Zhang, Huahao Xu, Fei Wu, Feng Lin

TL;DR
RemoCap introduces a novel disentangled representation learning approach for 3D human body reconstruction from motion sequences, effectively handling occlusions and improving accuracy through spatial and motion disentanglement techniques.
Contribution
It proposes Spatial Disentanglement and Motion Disentanglement modules, along with a sequence velocity loss, to enhance occlusion handling and temporal coherence in 3D human reconstruction.
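The exact form of the sequence velocity loss is not given in this summary. As a rough sketch only, one plausible form (an assumption, not the authors' definition) penalizes the difference between predicted and ground-truth frame-to-frame joint displacements. The function names and the list-of-tuples data layout below are illustrative.

```python
import math

def velocity(seq):
    """Frame-to-frame displacement of every joint.

    seq: list of frames; each frame is a list of (x, y, z) joint positions.
    Returns one list of displacement vectors per consecutive frame pair.
    """
    return [
        [tuple(b[k] - a[k] for k in range(3)) for a, b in zip(prev, curr)]
        for prev, curr in zip(seq, seq[1:])
    ]

def sequence_velocity_loss(pred, target):
    """Mean L2 distance between predicted and target joint velocities.

    Assumed form for illustration; the paper may weight or normalize differently.
    """
    total, count = 0.0, 0
    for frame_pred, frame_tgt in zip(velocity(pred), velocity(target)):
        for v_pred, v_tgt in zip(frame_pred, frame_tgt):
            total += math.dist(v_pred, v_tgt)
            count += 1
    return total / count if count else 0.0
```

Because this form compares displacements rather than positions, a constant positional offset contributes nothing to it, so in practice such a term would be combined with a per-frame position loss.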
Findings
Outperforms state-of-the-art methods on benchmark datasets.
Achieves best MPVPE, MPJPE, and PA-MPJPE metrics on 3DPW.
Demonstrates superior occlusion handling and motion fidelity.
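For readers unfamiliar with the metrics named above, MPJPE (mean per-joint position error) is the average Euclidean distance between predicted and ground-truth joints. The sketch below shows that basic computation; the function name and data layout are illustrative, and it assumes both poses are already expressed in the same root-aligned coordinate frame. MPVPE applies the same average to mesh vertices instead of joints, and PA-MPJPE first rigidly aligns the prediction to the ground truth with a Procrustes fit, which is omitted here.

```python
import math

def mpjpe(pred, target):
    """Mean per-joint position error for one pose.

    pred, target: equal-length lists of (x, y, z) joint positions,
    assumed to share a coordinate frame and unit.
    """
    if len(pred) != len(target) or not pred:
        raise ValueError("pred and target must be non-empty and the same length")
    return sum(math.dist(p, t) for p, t in zip(pred, target)) / len(pred)
```

For example, with two joints where one matches exactly and the other is off by 5 units, `mpjpe` returns 2.5.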
Abstract
Reconstructing 3D human bodies from realistic motion sequences remains a challenge due to pervasive and complex occlusions. Current methods struggle to capture the dynamics of occluded body parts, leading to model penetration and distorted motion. RemoCap leverages Spatial Disentanglement (SD) and Motion Disentanglement (MD) to overcome these limitations. SD addresses occlusion interference between the target human body and surrounding objects. It achieves this by disentangling target features along the dimension axis. By aligning features based on their spatial positions in each dimension, SD isolates the target object's response within a global window, enabling accurate capture despite occlusions. The MD module employs a channel-wise temporal shuffling strategy to simulate diverse scene dynamics. This process effectively disentangles motion features, allowing RemoCap to reconstruct…
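The abstract does not spell out how the channel-wise temporal shuffling works. The following is a minimal sketch of one reading, offered as an assumption rather than the authors' procedure: pick a fraction of feature channels and permute the frame order independently within each chosen channel. The function name, the `ratio` and `seed` parameters, and the frames-by-channels list layout are all hypothetical.

```python
import random

def channelwise_temporal_shuffle(features, ratio=0.5, seed=None):
    """Shuffle the temporal order of a subset of channels.

    features: list of T frames, each a list of C channel values.
    ratio: fraction of channels to shuffle (hypothetical parameter).
    Returns a new T x C list; the input is left unchanged.
    """
    rng = random.Random(seed)
    num_frames = len(features)
    num_channels = len(features[0])
    chosen = rng.sample(range(num_channels), int(num_channels * ratio))

    out = [list(frame) for frame in features]
    for c in chosen:
        # Draw an independent frame permutation for this channel.
        order = list(range(num_frames))
        rng.shuffle(order)
        for t, src in enumerate(order):
            out[t][c] = features[src][c]
    return out
```

Under this reading, each shuffled channel keeps the same set of values but loses its original temporal ordering, while the untouched channels preserve the true motion signal.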
Taxonomy
Topics: Human Pose and Action Recognition · Advanced Vision and Imaging · Generative Adversarial Networks and Image Synthesis
