Learning Variational Motion Prior for Video-based Motion Capture

Xin Chen, Zhuo Su, Lingbo Yang, Pei Cheng, Lan Xu, Bin Fu, and Gang Yu

arXiv:2210.15134·cs.CV·October 31, 2022·5 cites

Open Access

TL;DR

This paper introduces a variational motion prior framework using a transformer-based autoencoder to improve video-based motion capture, especially in challenging scenarios involving occlusion and complex poses, enabling real-time and stable motion estimation.

Contribution

We propose a novel variational motion prior model with a transformer-based autoencoder and style-mapping, enhancing generalization and real-time performance in video-based motion capture.

Findings

01. Reduces temporal jittering and failure modes in pose estimation

02. Achieves real-time motion capture during inference

03. Demonstrates superior performance on public and in-the-wild datasets

Abstract

Motion capture from a monocular video is fundamental and crucial for us humans to naturally experience and interact with each other in Virtual Reality (VR) and Augmented Reality (AR). However, existing methods still struggle with challenging cases involving self-occlusion and complex poses due to the lack of effective motion prior modeling. In this paper, we present a novel variational motion prior (VMP) learning approach for video-based motion capture to resolve the above issue. Instead of directly building the correspondence between the video and motion domains, we propose to learn a generic latent space for capturing the prior distribution of all natural motions, which serves as the basis for subsequent video-based motion capture tasks. To improve the generalization capacity of the prior space, we propose a transformer-based variational autoencoder pretrained over marker-based 3D mocap…
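
The abstract does not spell out the training objective, so the following is only a minimal sketch of the two ingredients a variational autoencoder usually relies on: the reparameterized latent sample and the KL term that pulls the latent distribution toward a prior. It assumes a diagonal Gaussian posterior and a standard normal prior, which are common defaults and not confirmed by this page. The function names, the 4-dimensional latent, and the example values are hypothetical, and nothing here reproduces the authors' transformer encoder or decoder.

```python
import math
import random


def reparameterize(mu, logvar, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1), independently per dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]


def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, logvar))


# Hypothetical encoder output for one motion clip (assumed 4-D latent for illustration).
mu = [0.0, 0.5, -0.5, 1.0]
logvar = [0.0, 0.0, 0.0, 0.0]  # unit variance in every dimension

rng = random.Random(0)  # seeded so the sample is repeatable
z = reparameterize(mu, logvar, rng)
kl = kl_to_standard_normal(mu, logvar)
print(len(z), kl)
```

With unit variance the KL term reduces to half the squared norm of the mean, so a posterior centred at the origin incurs no penalty. In a full model this term would be added to a motion reconstruction loss.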

Taxonomy

Topics: Advanced Vision and Imaging · Human Pose and Action Recognition · Human Motion and Animation

Methods: Temporal Jittering