Predicting Camera Viewpoint Improves Cross-dataset Generalization for 3D Human Pose Estimation

Zhe Wang; Daeyun Shin; Charless C. Fowlkes

arXiv:2004.03143 · cs.CV · April 8, 2020 · 6 cites

TL;DR

This paper investigates how predicting camera viewpoints during training enhances the ability of 3D human pose estimation models to generalize across different datasets, addressing dataset bias issues.

Contribution

It introduces the auxiliary task of camera viewpoint prediction, which improves cross-dataset generalization in 3D human pose estimation models.

Findings

01 Joint viewpoint and pose prediction improves generalization

02 Models trained with viewpoint prediction outperform baselines

03 Systematic dataset biases affect model performance

Abstract

Monocular estimation of 3D human pose has attracted increased attention with the availability of large ground-truth motion capture datasets. However, the diversity of available training data is limited, and it is not clear to what extent methods generalize outside the specific datasets they are trained on. In this work we carry out a systematic study of the diversity and biases present in specific datasets and their effect on cross-dataset generalization across a compendium of 5 pose datasets. We specifically focus on systematic differences in the distribution of camera viewpoints relative to a body-centered coordinate frame. Based on this observation, we propose an auxiliary task of predicting the camera viewpoint in addition to pose. We find that models trained to jointly predict viewpoint and pose show significantly improved cross-dataset generalization.
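To make "camera viewpoint relative to a body-centered coordinate frame" concrete, the sketch below shows one possible way to compute such a viewpoint from 3D joints given in camera coordinates. The abstract does not say how the body frame or the viewpoint is parameterized, so everything here is an assumption for illustration: the function name `camera_viewpoint`, the choice of hips and neck as reference joints, and the azimuth/elevation/distance output are hypothetical and may differ from the paper's definition.

```python
import numpy as np

def camera_viewpoint(left_hip, right_hip, neck):
    """Return (azimuth, elevation, distance) of the camera in a body frame.

    Assumptions (not taken from the paper): joints are 3D points in camera
    coordinates with the camera at the origin; the body frame has its origin
    at the pelvis (hip midpoint), x toward the subject's left, y along the
    torso, and z pointing out of the chest.
    """
    left_hip = np.asarray(left_hip, dtype=float)
    right_hip = np.asarray(right_hip, dtype=float)
    neck = np.asarray(neck, dtype=float)

    pelvis = 0.5 * (left_hip + right_hip)

    # x axis: from right hip to left hip, normalized.
    x_axis = left_hip - right_hip
    x_axis = x_axis / np.linalg.norm(x_axis)

    # y axis: torso direction, made orthogonal to x (Gram-Schmidt step).
    up = neck - pelvis
    up = up - up.dot(x_axis) * x_axis
    y_axis = up / np.linalg.norm(up)

    # z axis: completes a right-handed frame (subject's forward direction).
    z_axis = np.cross(x_axis, y_axis)

    # Columns of R are the body axes expressed in camera coordinates.
    R = np.stack([x_axis, y_axis, z_axis], axis=1)

    # Camera center (the origin) expressed in the body frame.
    cam_in_body = R.T @ (np.zeros(3) - pelvis)

    distance = np.linalg.norm(cam_in_body)
    azimuth = np.arctan2(cam_in_body[0], cam_in_body[2])
    elevation = np.arcsin(cam_in_body[1] / distance)
    return azimuth, elevation, distance
```

With a camera convention of x right, y down, z forward, a subject standing 4 units away and facing the camera yields an azimuth and elevation of zero. A model with an auxiliary viewpoint head could regress or classify quantities like these alongside the pose, although the source does not state which targets or losses the authors use.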

Taxonomy

Topics: Human Pose and Action Recognition · Advanced Vision and Imaging · Human Motion and Animation