MPL: Lifting 3D Human Pose from Multi-view 2D Poses
Seyed Abolfazl Ghasemzadeh, Alexandre Alahi, Christophe De Vleeschouwer

TL;DR
This paper introduces MPL, a transformer-based framework that lifts 3D human poses from multi-view 2D poses, effectively addressing occlusion issues and improving accuracy over triangulation methods.
Contribution
Proposes a transformer-based 2D-to-3D lifting network that is trained on synthetic 2D-3D pose pairs and combined with 2D pose estimation, with the aim of generalizing better to real-world multi-view scenes.
Findings
Achieves up to a 45% reduction in MPJPE (mean per-joint position error) compared with triangulating the 2D poses.
Effectively combines 2D pose estimation with 2D-to-3D lifting.
Demonstrates robustness in multi-view 3D human pose estimation.
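To make the headline metric concrete, here is a minimal sketch of MPJPE (mean per-joint position error), the average Euclidean distance between predicted and ground-truth joints, together with the relative reduction between two methods. The three-joint poses and all numbers below are invented for illustration and are not results from the paper; the function names are hypothetical.

```python
import math


def mpjpe(pred, gt):
    """Mean per-joint position error: mean Euclidean distance over joints."""
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and have equal length")
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(pred)


def relative_reduction(baseline_error, new_error):
    """Fraction by which new_error improves on baseline_error."""
    return (baseline_error - new_error) / baseline_error


# Made-up 3-joint poses in millimetres (illustrative only, not paper data).
gt = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0), (100.0, 200.0, 0.0)]
baseline = [(0.0, 0.0, 40.0), (100.0, 30.0, 40.0), (100.0, 200.0, 30.0)]
improved = [(0.0, 0.0, 30.0), (100.0, 0.0, 40.0), (100.0, 200.0, 20.0)]

baseline_err = mpjpe(baseline, gt)   # per-joint errors 40, 50, 30
improved_err = mpjpe(improved, gt)   # per-joint errors 30, 40, 20
reduction = relative_reduction(baseline_err, improved_err)
print(f"{baseline_err:.1f} mm -> {improved_err:.1f} mm ({reduction:.0%} lower)")
```

A reported "up to 45% reduction" would correspond to `relative_reduction` returning 0.45 in the best-case configuration.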
Abstract
Estimating 3D human poses from 2D images is challenging due to occlusions and projective acquisition. Learning-based approaches have been largely studied to address this challenge, both in single and multi-view setups. These solutions, however, fail to generalize to real-world cases due to the lack of (multi-view) 'in-the-wild' images paired with 3D poses for training. For this reason, we propose combining 2D pose estimation, for which large and rich training datasets exist, and 2D-to-3D pose lifting, using a transformer-based network that can be trained from synthetic 2D-3D pose pairs. Our experiments demonstrate decreases of up to 45% in MPJPE compared to the 3D pose obtained by triangulating the 2D poses. The framework's source code is available at https://github.com/aghasemzadeh/OpenMPL.
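For context on the baseline the abstract compares against, the following is a minimal sketch of linear least-squares triangulation of a single joint from calibrated views. It is a generic textbook formulation, not necessarily the exact variant used in the paper, and the camera matrices, the 3D point, and the helper names (`project`, `solve3`, `triangulate`) are all assumptions made for illustration.

```python
def project(P, X):
    """Project 3D point X with a 3x4 camera matrix P to pixel coordinates."""
    Xh = (*X, 1.0)
    x, y, w = (sum(p * c for p, c in zip(row, Xh)) for row in P)
    return (x / w, y / w)


def solve3(M, y):
    """Solve the 3x3 system M x = y by Gaussian elimination with pivoting."""
    A = [row[:] + [rhs] for row, rhs in zip(M, y)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("degenerate camera configuration")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, 3):
            f = A[r][col] / A[col][col]
            for c in range(col, 4):
                A[r][c] -= f * A[col][c]
    x = [0.0] * 3
    for r in reversed(range(3)):
        s = A[r][3] - sum(A[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = s / A[r][r]
    return x


def triangulate(cameras, points_2d):
    """Least-squares 3D point from 2D observations in two or more views."""
    rows, rhs = [], []
    for P, (u, v) in zip(cameras, points_2d):
        # Each view gives two linear constraints: (u * P[2] - P[0]) . [X, 1] = 0
        # and (v * P[2] - P[1]) . [X, 1] = 0.
        for coord, axis in ((u, 0), (v, 1)):
            r = [coord * P[2][k] - P[axis][k] for k in range(4)]
            rows.append(r[:3])
            rhs.append(-r[3])
    # Normal equations (A^T A) X = A^T b.
    M = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    y = [sum(r[i] * b for r, b in zip(rows, rhs)) for i in range(3)]
    return solve3(M, y)


# Two hypothetical cameras: same intrinsics, second one shifted 1 m along x.
P1 = [[1000.0, 0.0, 500.0, 0.0], [0.0, 1000.0, 500.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
P2 = [[1000.0, 0.0, 500.0, -1000.0], [0.0, 1000.0, 500.0, 0.0], [0.0, 0.0, 1.0, 0.0]]

X_true = (0.2, -0.1, 4.0)  # made-up joint position in metres
observations = [project(P1, X_true), project(P2, X_true)]
X_est = triangulate([P1, P2], observations)
print(X_est)
```

With noise-free 2D detections the point is recovered up to floating-point error. Triangulation has no pose prior, so errors in the 2D detections (for example under occlusion) propagate directly into the 3D estimate, which is the setting where a learned lifting network is claimed to help.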
Taxonomy
Topics: Human Pose and Action Recognition · Human Motion and Animation · Hand Gesture Recognition Systems
