Learning to Estimate 3D Human Pose and Shape from a Single Color Image

Georgios Pavlakos; Luyang Zhu; Xiaowei Zhou; Kostas Daniilidis

arXiv:1805.04092·cs.CV·May 11, 2018·22 cites

Open Access

TL;DR

This paper introduces a ConvNet-based method that estimates detailed 3D human pose and shape from a single color image. By integrating a parametric body model (SMPL), the network only has to predict a small number of parameters, which makes direct prediction a practical alternative to iterative optimization even when training data is limited.

Contribution

It presents an end-to-end framework that predicts the parameters of a 3D body model from 2D keypoints and masks, and incorporates a differentiable renderer to refine the network, which reduces reliance on 3D ground-truth data.
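The idea of refining against 2D annotations can be illustrated with a keypoint reprojection loss. The sketch below is a minimal illustration, not the paper's renderer: the weak-perspective camera, the function names, and the visibility weighting are all assumptions made for the example. It is written in NumPy for readability; in an autodiff framework the same expression would be differentiable with respect to the 3D joints.

```python
import numpy as np

def project_weak_perspective(joints_3d, scale, translation):
    """Project (N, 3) joints to (N, 2) image points.

    Assumed camera model: drop depth, then scale and shift in the image plane.
    """
    return scale * joints_3d[:, :2] + translation

def keypoint_reprojection_loss(joints_3d, keypoints_2d, visibility, scale, translation):
    """Mean squared 2D distance between projected joints and annotated keypoints.

    visibility is an (N,) array of 0/1 flags; hidden keypoints are ignored.
    """
    projected = project_weak_perspective(joints_3d, scale, translation)
    squared_error = np.sum((projected - keypoints_2d) ** 2, axis=1)
    return float(np.sum(visibility * squared_error) / max(np.sum(visibility), 1.0))

# Hypothetical toy data: three joints and a fixed camera.
joints_3d = np.array([[0.0, 0.0, 2.0], [0.5, 1.0, 2.1], [-0.5, 1.0, 1.9]])
scale, translation = 100.0, np.array([112.0, 112.0])
visibility = np.ones(3)

exact_keypoints = project_weak_perspective(joints_3d, scale, translation)
shifted_keypoints = exact_keypoints + np.array([1.0, 0.0])

loss_exact = keypoint_reprojection_loss(joints_3d, exact_keypoints, visibility, scale, translation)
loss_shifted = keypoint_reprojection_loss(joints_3d, shifted_keypoints, visibility, scale, translation)
```

A loss of this kind needs only 2D keypoint annotations, which is why supervision through a projection step can stand in for 3D ground truth.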

Findings

01 Outperforms previous methods on 3D human shape estimation

02 Uses only 2D keypoints and masks for reliable parameter prediction

03 Employs a differentiable renderer for improved accuracy

Abstract

This work addresses the problem of estimating the full body 3D human pose and shape from a single color image. This is a task where iterative optimization-based solutions have typically prevailed, while Convolutional Networks (ConvNets) have suffered because of the lack of training data and their low-resolution 3D predictions. Our work aims to bridge this gap and proposes an efficient and effective direct prediction method based on ConvNets. A central part of our approach is the incorporation of a parametric statistical body shape model (SMPL) within our end-to-end framework. This allows us to get very detailed 3D mesh results while requiring the estimation of only a small number of parameters, which makes the model well suited to direct network prediction. Interestingly, we demonstrate that these parameters can be predicted reliably from 2D keypoints and masks alone. These are typical outputs of generic…
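To make the "small number of parameters" point concrete, the sketch below compares the size of an SMPL-style parameter vector with the size of the mesh it controls, and applies a toy linear shape blend. The vertex and parameter counts follow the standard SMPL configuration; the template and shape basis are random placeholders rather than the real SMPL data, and pose-dependent skinning is omitted entirely, so this is only an illustration of the dimensionality argument.

```python
import numpy as np

NUM_VERTICES = 6890  # vertices in the SMPL mesh
NUM_BETAS = 10       # shape coefficients in the standard SMPL setup
NUM_POSE = 72        # 24 joints x 3 axis-angle values

def toy_shape_blend(template, shape_dirs, betas):
    """Return template + sum_k betas[k] * shape_dirs[:, :, k] as an (V, 3) array."""
    return template + shape_dirs @ betas

rng = np.random.default_rng(0)
# Placeholder data standing in for the learned SMPL template and shape basis.
template = rng.normal(size=(NUM_VERTICES, 3))
shape_dirs = 0.01 * rng.normal(size=(NUM_VERTICES, 3, NUM_BETAS))

neutral = toy_shape_blend(template, shape_dirs, np.zeros(NUM_BETAS))
varied = toy_shape_blend(template, shape_dirs, rng.normal(size=NUM_BETAS))

num_predicted = NUM_BETAS + NUM_POSE   # values a network would regress
num_mesh_coords = NUM_VERTICES * 3     # coordinates in the output mesh
```

Under these assumptions a network would regress 82 numbers instead of 20,670 vertex coordinates, while the model still yields a full-resolution mesh.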


Taxonomy

Topics: Human Pose and Action Recognition · 3D Shape Modeling and Analysis · Advanced Vision and Imaging