MoCap-guided Data Augmentation for 3D Pose Estimation in the Wild

Grégory Rogez; Cordelia Schmid

arXiv:1607.02046 · cs.CV · October 31, 2016 · 195 cites

TL;DR

This paper introduces an image-based synthesis engine that uses 3D MoCap data to augment a dataset of real images carrying 2D pose annotations. The resulting photorealistic synthetic images come with 3D pose annotations and are used to train a CNN for 3D human pose estimation in the wild.

Contribution

The authors propose a new MoCap-guided image synthesis method to create large, diverse training datasets for CNNs, enhancing 3D pose estimation accuracy in real-world scenarios.

Findings

01. Outperforms the state of the art on the Human3.6M dataset

02. Shows promising results on the in-the-wild LSP dataset

03. Demonstrates that a CNN trained on the synthetic images generalizes to real images

Abstract

This paper addresses the problem of 3D human pose estimation in the wild. A significant challenge is the lack of training data, i.e., 2D images of humans annotated with 3D poses. Such data is necessary to train state-of-the-art CNN architectures. Here, we propose a solution to generate a large set of photorealistic synthetic images of humans with 3D pose annotations. We introduce an image-based synthesis engine that artificially augments a dataset of real images with 2D human pose annotations using 3D Motion Capture (MoCap) data. Given a candidate 3D pose, our algorithm selects for each joint an image whose 2D pose locally matches the projected 3D pose. The selected images are then combined to generate a new synthetic image by stitching local image patches in a kinematically constrained manner. The resulting images are used to train an end-to-end CNN for full-body 3D pose estimation. We…
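
The per-joint selection step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the camera model or the matching criterion, so the perspective projection with unit focal length, the toy three-joint skeleton in `NEIGHBOURS`, and the `local_distance` measure are all assumptions made for illustration. The sketch covers only the selection step, not the kinematically constrained patch stitching.

```python
import math

# Hypothetical toy skeleton (not from the paper): each joint maps to the
# joints that define its local neighbourhood, including itself.
NEIGHBOURS = {
    0: [0, 1],
    1: [0, 1, 2],
    2: [1, 2],
}


def project(pose_3d, focal=1.0):
    """Assumed perspective projection (x, y, z) -> (f*x/z, f*y/z); needs z > 0."""
    return [(focal * x / z, focal * y / z) for x, y, z in pose_3d]


def local_distance(pose_a, pose_b, joint):
    """Distance between two 2D poses around `joint`.

    Both poses are centred on `joint` first, so only the local configuration
    of the neighbouring joints matters, not the position in the image.
    """
    ax, ay = pose_a[joint]
    bx, by = pose_b[joint]
    total = 0.0
    for k in NEIGHBOURS[joint]:
        dx = (pose_a[k][0] - ax) - (pose_b[k][0] - bx)
        dy = (pose_a[k][1] - ay) - (pose_b[k][1] - by)
        total += dx * dx + dy * dy
    return math.sqrt(total)


def select_images(pose_3d, annotated_2d_poses):
    """For each joint, pick the index of the image whose 2D pose matches best locally."""
    target = project(pose_3d)
    candidates = range(len(annotated_2d_poses))
    return {
        joint: min(
            candidates,
            key=lambda i: local_distance(target, annotated_2d_poses[i], joint),
        )
        for joint in NEIGHBOURS
    }


# Usage: one candidate 3D pose and two images with 2D pose annotations.
candidate_pose = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (1.0, 1.0, 1.0)]
images = [
    [(5.0, 5.0), (6.0, 5.0), (6.0, 4.0)],
    [(2.0, 2.5), (3.0, 2.0), (3.0, 3.0)],
]
selection = select_images(candidate_pose, images)
```

Here `selection` maps each joint index to the index of the chosen image. Different joints may be assigned different source images, which is what makes a later stitching step necessary.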

Taxonomy

Topics: Human Pose and Action Recognition · Hand Gesture Recognition Systems · Video Surveillance and Tracking Methods