Learning to Estimate 3D Human Pose and Shape from a Single Color Image
Georgios Pavlakos, Luyang Zhu, Xiaowei Zhou, Kostas Daniilidis

TL;DR
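This paper introduces a ConvNet-based method that directly predicts detailed 3D human pose and shape from a single color image. By integrating a parametric body model (SMPL) into the network, it needs to estimate only a small number of parameters, which makes direct prediction practical with limited training data and offers an alternative to the iterative optimization approaches that have typically prevailed.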
Contribution
It presents an end-to-end framework that predicts the parameters of a 3D body model from 2D keypoints and masks. A differentiable renderer projects the resulting mesh back to the image, so the network can be refined against 2D annotations, which reduces reliance on 3D ground truth data.
Findings
Outperforms previous methods on 3D human shape estimation
Uses only 2D keypoints and masks for reliable parameter prediction
Employs a differentiable renderer to refine the network against 2D annotations
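The refinement idea in the findings above can be illustrated with a keypoint reprojection loss: project the predicted 3D joints to the image and penalize the distance to annotated 2D keypoints. The following is only a minimal sketch. The weak-perspective camera, the function names, and the toy numbers are assumptions made for illustration, not details taken from the paper, and a real training pipeline would write the same arithmetic in an automatic-differentiation framework so gradients can flow back to the network.

```python
import numpy as np

def project_weak_perspective(joints_3d, scale, trans_2d):
    """Project (N, 3) joints to (N, 2) image points: drop depth, then scale and shift.

    The weak-perspective camera is an assumption of this sketch.
    """
    return scale * joints_3d[:, :2] + trans_2d

def keypoint_reprojection_loss(joints_3d, keypoints_2d, visibility, scale, trans_2d):
    """Mean squared 2D error, averaged over visible keypoints only."""
    projected = project_weak_perspective(joints_3d, scale, trans_2d)
    sq_err = np.sum((projected - keypoints_2d) ** 2, axis=1)
    n_visible = max(visibility.sum(), 1.0)  # avoid division by zero
    return float(np.sum(visibility * sq_err) / n_visible)

# Toy data (hypothetical): three joints and their annotated 2D locations.
joints_3d = np.array([[0.0, 0.0, 2.0], [0.5, 1.0, 2.5], [-0.5, 1.0, 1.5]])
keypoints_2d = np.array([[100.0, 100.0], [150.0, 200.0], [60.0, 200.0]])
visibility = np.array([1.0, 1.0, 1.0])

loss = keypoint_reprojection_loss(
    joints_3d, keypoints_2d, visibility,
    scale=100.0, trans_2d=np.array([100.0, 100.0]),
)
print(loss)
```

Only the third joint is off (by 10 pixels horizontally), so the loss is driven entirely by that keypoint. A mask-based term would compare a rendered silhouette with the annotated mask in the same spirit.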
Abstract
This work addresses the problem of estimating the full body 3D human pose and shape from a single color image. This is a task where iterative optimization-based solutions have typically prevailed, while Convolutional Networks (ConvNets) have suffered because of the lack of training data and their low-resolution 3D predictions. Our work aims to bridge this gap and proposes an efficient and effective direct prediction method based on ConvNets. A central part of our approach is the incorporation of a parametric statistical body shape model (SMPL) within our end-to-end framework. This allows us to get very detailed 3D mesh results, while requiring estimation only of a small number of parameters, making it friendly for direct network prediction. Interestingly, we demonstrate that these parameters can be predicted reliably only from 2D keypoints and masks. These are typical outputs of generic…
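To make the abstract's point about "a small number of parameters" concrete, here is a toy sketch of how a few coefficients can expand into a dense mesh through a linear basis. The sizes are the standard SMPL ones (6890 vertices, 10 shape coefficients, 24 joints with 3 axis-angle values each). The template and basis below are random placeholders rather than the real SMPL data, and pose-dependent deformation and skinning are omitted. How the paper itself parameterizes the network output is not stated in this summary.

```python
import numpy as np

N_VERTICES = 6890  # SMPL mesh resolution
N_SHAPE = 10       # shape coefficients (betas)
N_POSE = 72        # 24 joints x 3 axis-angle values

rng = np.random.default_rng(0)
# Placeholders only: a real model loads a learned template and shape basis.
template = rng.normal(size=(N_VERTICES, 3))
shape_dirs = rng.normal(size=(N_VERTICES, 3, N_SHAPE))

def shaped_vertices(betas):
    """Rest-pose mesh as template plus a linear combination of shape directions."""
    return template + shape_dirs @ betas

betas = np.zeros(N_SHAPE)          # zero coefficients give the template (mean) shape
vertices = shaped_vertices(betas)

n_params = N_SHAPE + N_POSE        # what a network would need to regress
n_outputs = N_VERTICES * 3         # what regressing raw vertex coordinates would cost
print(vertices.shape, n_params, n_outputs)
```

Under these sizes the network regresses 82 numbers instead of 20670 vertex coordinates, which is the sense in which a parametric model is friendly to direct prediction.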
Taxonomy
Topics: Human Pose and Action Recognition · 3D Shape Modeling and Analysis · Advanced Vision and Imaging
