SMAP: Single-Shot Multi-Person Absolute 3D Pose Estimation
Jianan Zhen, Qi Fang, Jiaming Sun, Wentao Liu, Wei Jiang, Hujun Bao, Xiaowei Zhou

TL;DR
This paper introduces SMAP, a novel single-shot bottom-up method for multi-person 3D pose estimation from a single RGB image that leverages 2.5D body part representations to improve accuracy and contextual reasoning.
Contribution
The paper proposes a new bottom-up approach that regresses 2.5D body part representations and reconstructs 3D poses, outperforming previous top-down methods.
Findings
Achieves state-of-the-art results on CMU Panoptic and MuPoTS-3D datasets.
Effectively models inter-person depth relationships.
Applicable to in-the-wild videos.
Abstract
Recovering multi-person 3D poses with absolute scales from a single RGB image is a challenging problem due to the inherent depth and scale ambiguity from a single view. Addressing this ambiguity requires aggregating various cues over the entire image, such as body sizes, scene layouts, and inter-person relationships. However, most previous methods adopt a top-down scheme that first performs 2D pose detection and then regresses the 3D pose and scale for each detected person individually, ignoring global contextual cues. In this paper, we propose a novel system that first regresses a set of 2.5D representations of body parts and then reconstructs the 3D absolute poses based on these 2.5D representations with a depth-aware part association algorithm. Such a single-shot bottom-up scheme allows the system to better learn and reason about the inter-person depth relationship, improving both…
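To illustrate what reconstructing absolute 3D poses from 2.5D representations can involve, here is a minimal sketch under stated assumptions. It assumes a pinhole camera with known intrinsics (fx, fy, cx, cy), and it assumes the 2.5D representation of one person consists of 2D joint locations in pixels, one absolute root depth, and per-joint depths relative to the root. The function names and this parameterization are hypothetical, chosen for illustration; they are not taken from the paper's code, and the depth-aware part association step is not shown.

```python
import numpy as np

def backproject(keypoints_2d, depths, fx, fy, cx, cy):
    """Lift pixel coordinates with per-joint absolute depth to camera-space 3D points.

    Standard pinhole back-projection: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy.
    """
    keypoints_2d = np.asarray(keypoints_2d, dtype=float)  # shape (J, 2), pixels
    depths = np.asarray(depths, dtype=float)              # shape (J,), e.g. metres
    x = (keypoints_2d[:, 0] - cx) * depths / fx
    y = (keypoints_2d[:, 1] - cy) * depths / fy
    return np.stack([x, y, depths], axis=1)               # shape (J, 3)

def lift_person(keypoints_2d, root_depth, relative_depths, fx, fy, cx, cy):
    """Combine an absolute root depth with root-relative joint depths (assumed inputs)."""
    depths = root_depth + np.asarray(relative_depths, dtype=float)
    return backproject(keypoints_2d, depths, fx, fy, cx, cy)

# Hypothetical two-joint example: a root at the image centre, 2.0 units away,
# and a second joint 100 px to the right and 0.5 units farther from the camera.
pose_3d = lift_person(
    keypoints_2d=[[320.0, 240.0], [420.0, 240.0]],
    root_depth=2.0,
    relative_depths=[0.0, 0.5],
    fx=1000.0, fy=1000.0, cx=320.0, cy=240.0,
)
```

Because each person's root depth is absolute, lifting every person in the image with the same intrinsics places them in one shared camera coordinate frame, which is what makes inter-person depth ordering comparable.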
Taxonomy
Topics: Human Pose and Action Recognition · Advanced Vision and Imaging · Video Surveillance and Tracking Methods
