Single-shot 3D multi-person pose estimation in complex images
Abdallah Benzine, Bertrand Luvison, Quoc Cuong Pham, Catherine Achard

TL;DR
This paper introduces a novel single-shot approach for multi-person 3D human pose estimation in complex images, capable of handling multiple people without bounding boxes and robust to occlusions.
Contribution
It extends the Stacked Hourglass Network with multi-scale features and employs associative embedding for joint grouping, enabling accurate 3D pose estimation for multiple people in complex scenes.
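The grouping step can be illustrated with a minimal sketch. It assumes each detected joint carries a scalar embedding "tag", and that joints of the same person have similar tags. The names `Detection`, `Person`, `group_by_tag`, and `tag_threshold` are hypothetical and are not taken from the paper; the greedy matching shown is one common way to decode associative embeddings, not necessarily the authors' exact procedure, and the 3D coordinate regression is not shown.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Detection:
    joint: int    # joint type index (e.g. 0 = head, 1 = neck, ...)
    x: float      # image-plane location of the heatmap peak
    y: float
    tag: float    # scalar embedding predicted at that location (assumed 1-D)
    score: float  # heatmap confidence


@dataclass
class Person:
    joints: Dict[int, Detection] = field(default_factory=dict)

    def mean_tag(self) -> float:
        # Reference tag of a person: average tag of the joints assigned so far.
        tags = [d.tag for d in self.joints.values()]
        return sum(tags) / len(tags)


def group_by_tag(detections: List[Detection], num_joints: int,
                 tag_threshold: float = 1.0) -> List[Person]:
    """Greedily group joint detections into people by embedding distance."""
    people: List[Person] = []
    for j in range(num_joints):
        # Handle the most confident detections of this joint type first.
        candidates = sorted((d for d in detections if d.joint == j),
                            key=lambda d: d.score, reverse=True)
        for det in candidates:
            best: Optional[Person] = None
            best_dist = tag_threshold
            for person in people:
                if j in person.joints:
                    continue  # this person already has a joint of type j
                dist = abs(person.mean_tag() - det.tag)
                if dist < best_dist:
                    best, best_dist = person, dist
            if best is None:
                # No existing person is close enough: start a new skeleton.
                best = Person()
                people.append(best)
            best.joints[j] = det
    return people
```

Because a new skeleton is started whenever no existing person matches, the number of people does not need to be known in advance, and a person with missing joints (for example, due to occlusion) simply ends up with a partial skeleton.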
Findings
Outperforms the state of the art on CMU Panoptic and a previous single-shot method on MuPoTS-3D.
Achieves good results on complex synthetic images from the JTA Dataset.
Handles variable number of people and occlusions without bounding boxes.
Abstract
In this paper, we propose a new single shot method for multi-person 3D human pose estimation in complex images. The model jointly learns to locate the human joints in the image, to estimate their 3D coordinates and to group these predictions into full human skeletons. The proposed method deals with a variable number of people and does not need bounding boxes to estimate the 3D poses. It leverages and extends the Stacked Hourglass Network and its multi-scale feature learning to manage multi-person situations. Thus, we exploit a robust 3D human pose formulation to fully describe several 3D human poses even in case of strong occlusions or crops. Then, joint grouping and human pose estimation for an arbitrary number of people are performed using the associative embedding method. Our approach significantly outperforms the state of the art on the challenging CMU Panoptic and a previous single…
Taxonomy
Methods: 1x1 Convolution · Max Pooling · Residual Connection · Convolution · Hourglass Module · Stacked Hourglass Network
