ProxyCap: Real-time Monocular Full-body Capture in World Space via Human-Centric Proxy-to-Motion Learning

Yuxiang Zhang; Hongwen Zhang; Liangxiao Hu; Jiajun Zhang; Hongwei Yi; Shengping Zhang; Yebin Liu

arXiv:2307.01200·cs.CV·December 27, 2023

Open Access

TL;DR

ProxyCap introduces a real-time monocular full-body capture system that leverages human-centric proxy-to-motion learning, enabling accurate and physically plausible world-space motion estimation from monocular videos with moving cameras.

Contribution

The paper presents a novel human-centric proxy-to-motion learning scheme and a contact-aware neural motion descent module for real-time, accurate full-body capture in world space.

Findings

01. First real-time monocular full-body capture with foot-ground contact.

02. Achieves accurate world-space motion estimation from monocular videos.

03. Demonstrates robustness with hand-held moving cameras.

Abstract

Learning-based approaches to monocular motion capture have recently shown promising results by learning to regress in a data-driven manner. However, due to the challenges in data collection and network designs, it remains challenging for existing solutions to achieve real-time full-body capture while being accurate in world space. In this work, we introduce ProxyCap, a human-centric proxy-to-motion learning scheme to learn world-space motions from a proxy dataset of 2D skeleton sequences and 3D rotational motions. Such proxy data enables us to build a learning-based network with accurate world-space supervision while also mitigating the generalization issues. For more accurate and physically plausible predictions in world space, our network is designed to learn human motions from a human-centric perspective, which enables the understanding of the same motion captured with different…

Taxonomy

Topics: Human Pose and Action Recognition · Human Motion and Animation · Video Surveillance and Tracking Methods

Methods: Attentive Walk-Aggregating Graph Neural Network