Hand Pose Estimation via Latent 2.5D Heatmap Regression
Umar Iqbal, Pavlo Molchanov, Thomas Breuel, Juergen Gall, Jan Kautz

TL;DR
This paper introduces a 2.5D heatmap regression method for estimating 3D hand pose from a single RGB image, achieving state-of-the-art results even under severe occlusion.
Contribution
The paper proposes a new 2.5D pose representation and a CNN architecture that implicitly learns depth maps and heatmap distributions for monocular 3D hand pose estimation.
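The summary does not spell out how a heatmap and a depth map become a keypoint. One common reading, offered here as a minimal sketch rather than the authors' implementation, is a spatial softmax followed by an expectation: the normalised heatmap gives the expected pixel location, and the same weights average a per-pixel depth map into one depth value. The function name `soft_argmax_2p5d` and both inputs are hypothetical.

```python
import math


def soft_argmax_2p5d(logits, depth_map):
    """Return (x, y, z) for one keypoint.

    logits:    raw 2D score map as a list of rows (hypothetical network output).
    depth_map: per-pixel depth values with the same shape as logits.
    """
    # Spatial softmax: turn raw scores into a probability heatmap.
    # Subtracting the maximum keeps exp() numerically stable.
    peak = max(max(row) for row in logits)
    weights = [[math.exp(v - peak) for v in row] for row in logits]
    total = sum(sum(row) for row in weights)
    heatmap = [[w / total for w in row] for row in weights]

    # Expected column (x), expected row (y), and expected depth (z)
    # under the heatmap distribution.
    x = sum(p * col for row in heatmap for col, p in enumerate(row))
    y = sum(p * r for r, row in enumerate(heatmap) for p in row)
    z = sum(
        p * d
        for h_row, d_row in zip(heatmap, depth_map)
        for p, d in zip(h_row, d_row)
    )
    return x, y, z
```

Because every step is a weighted sum, a read-out of this kind is differentiable, so a network can be supervised on the resulting coordinates without a hand-designed target heatmap.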
Findings
Achieves state-of-the-art 2D and 3D hand pose estimation results.
Handles severe occlusions effectively.
Estimates pose up to a scaling factor; the scale itself can also be recovered when a hand-size prior is available.
Abstract
Estimating the 3D pose of a hand is an essential part of human-computer interaction. Estimating 3D pose using depth or multi-view sensors has become easier with recent advances in computer vision; however, regressing pose from a single RGB image is much less straightforward. The main difficulty arises from the fact that 3D pose requires some form of depth estimates, which are ambiguous given only an RGB image. In this paper we propose a new method for 3D hand pose estimation from a monocular image through a novel 2.5D pose representation. Our new representation estimates pose up to a scaling factor, which can be estimated additionally if a prior of the hand size is given. We implicitly learn depth maps and heatmap distributions with a novel CNN architecture. Our system achieves state-of-the-art estimation of 2D and 3D hand pose on several challenging datasets in the presence of severe…
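To illustrate what "pose up to a scaling factor" can mean in practice, the sketch below lifts 2.5D keypoints to 3D under stated assumptions that are not taken from the abstract: a pinhole camera, image coordinates already normalised by the intrinsics, depths given relative to an unknown root depth, and one reference bone whose length is fixed (1 in scale-normalised space). Requiring the back-projected bone to have that length gives a quadratic in the root depth. The names `recover_root_depth`, `lift_to_3d`, and `scale` are hypothetical.

```python
import math


def recover_root_depth(joint_n, joint_m, bone_length=1.0):
    """Solve for the root depth that gives the bone n-m the requested length.

    Each joint is (x, y, z_rel): normalised image coordinates plus depth
    relative to the root. A 3D point is (x * Z, y * Z, Z) with Z = z_root + z_rel.
    """
    xn, yn, zn = joint_n
    xm, ym, zm = joint_m
    # Coefficients of a * z_root^2 + b * z_root + c = 0, obtained by expanding
    # ||P_n - P_m||^2 = bone_length^2 under the pinhole model.
    a = (xn - xm) ** 2 + (yn - ym) ** 2
    b = 2.0 * ((xn - xm) * (xn * zn - xm * zm) + (yn - ym) * (yn * zn - ym * zm))
    c = ((xn * zn - xm * zm) ** 2 + (yn * zn - ym * zm) ** 2
         + (zn - zm) ** 2 - bone_length ** 2)
    if a == 0.0:
        raise ValueError("joints project to the same pixel; root depth is undetermined")
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        raise ValueError("no real root depth satisfies the bone-length constraint")
    # Take the larger root, which places the hand in front of the camera.
    return (-b + math.sqrt(disc)) / (2.0 * a)


def lift_to_3d(joints, z_root, scale=1.0):
    """Back-project 2.5D joints; scale is an optional hand-size prior."""
    points = []
    for x, y, z_rel in joints:
        z = z_root + z_rel
        points.append((scale * x * z, scale * y * z, scale * z))
    return points
```

With `scale=1.0` the output is a scale-normalised pose. Passing a known reference bone length as `scale` converts it to absolute units, which is where a hand-size prior would enter in this sketch.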
Taxonomy
Topics: Human Pose and Action Recognition · Hand Gesture Recognition Systems · Robot Manipulation and Learning
Methods: Heatmap
