A Dual-Masked Auto-Encoder for Robust Motion Capture with Spatial-Temporal Skeletal Token Completion

Junkun Jiang; Jie Chen; Yike Guo

arXiv:2207.07381 · cs.CV · July 18, 2022 · 1 citation

TL;DR

This paper introduces a dual-masked auto-encoder framework utilizing transformer-based skeletal motion modeling to improve robustness in multi-person motion capture, especially under severe occlusion and data loss.

Contribution

It proposes an adaptive, identity-aware triangulation and a novel Dual-Masked Auto-Encoder for complete 3D skeletal motion reconstruction under challenging conditions.

Findings

01. Outperforms state-of-the-art methods on benchmark datasets.

02. Effective in scenarios with severe occlusion and missing data.

03. Introduces a new high-accuracy multi-person motion capture dataset.

Abstract

Multi-person motion capture can be challenging due to ambiguities caused by severe occlusion, fast body movement, and complex interactions. Existing frameworks build on 2D pose estimation and triangulate to 3D coordinates by reasoning about the appearance, trajectory, and geometric consistencies among multi-camera observations. However, 2D joint detection is usually incomplete and carries wrong identity assignments due to limited observation angles, which leads to noisy 3D triangulation results. To overcome this issue, we propose to explore the short-range autoregressive characteristics of skeletal motion using a transformer. First, we propose an adaptive, identity-aware triangulation module to reconstruct 3D joints and identify the missing joints for each identity. To generate complete 3D skeletal motion, we then propose a Dual-Masked Auto-Encoder (D-MAE) which encodes the joint status with both…
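
To make the triangulation step in the abstract more concrete, the following is a minimal sketch of standard linear (DLT-style) multi-view triangulation of a single joint. It is not the paper's adaptive, identity-aware module: it assumes cross-view identity association is already solved, and the names `project`, `triangulate_joint`, and `min_views` are hypothetical. A view with no 2D detection is passed as `None`, and a joint seen in too few views is reported as missing (`None`), which is the kind of gap a completion model would then have to fill.

```python
from typing import List, Optional, Sequence, Tuple

Camera = Sequence[Sequence[float]]  # 3x4 projection matrix (assumed known)
Point2D = Tuple[float, float]


def project(P: Camera, X: Sequence[float]) -> Point2D:
    """Project a 3D point X into a camera with 3x4 projection matrix P."""
    h = [sum(P[r][k] * X[k] for k in range(3)) + P[r][3] for r in range(3)]
    return (h[0] / h[2], h[1] / h[2])


def _solve3(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate camera configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= f * m[col][c]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):  # back substitution
        s = m[i][3] - sum(m[i][j] * x[j] for j in range(i + 1, 3))
        x[i] = s / m[i][i]
    return x


def triangulate_joint(
    cams: Sequence[Camera],
    obs: Sequence[Optional[Point2D]],
    min_views: int = 2,
) -> Optional[List[float]]:
    """Least-squares 3D position of one joint, or None if it is missing.

    Each observed view contributes two linear equations in X:
        (u * P[2] - P[0]) . [X, 1] = 0
        (v * P[2] - P[1]) . [X, 1] = 0
    """
    rows: List[List[float]] = []
    rhs: List[float] = []
    for P, uv in zip(cams, obs):
        if uv is None:  # this camera did not detect the joint
            continue
        for coord, r in ((uv[0], 0), (uv[1], 1)):
            rows.append([coord * P[2][k] - P[r][k] for k in range(3)])
            rhs.append(P[r][3] - coord * P[2][3])
    if len(rows) < 2 * min_views:
        return None  # too few views: flag the joint as missing
    # Normal equations (A^T A) x = A^T b for the stacked system A x = b.
    ata = [[sum(row[i] * row[j] for row in rows) for j in range(3)]
           for i in range(3)]
    atb = [sum(row[i] * b for row, b in zip(rows, rhs)) for i in range(3)]
    return _solve3(ata, atb)
```

With noise-free detections from two or more non-degenerate cameras, `triangulate_joint` recovers the 3D point up to floating-point error. With noisy or mislabeled detections the least-squares estimate degrades, which is the failure mode the abstract attributes to incomplete 2D detection.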

Code & Models

Repositories

hkbu-vscomputing/2022_mm_dmae-mocap
PyTorch · Official

Taxonomy

Topics: Human Pose and Action Recognition · Human Motion and Animation · Video Analysis and Summarization