Hand Pose Estimation through Semi-Supervised and Weakly-Supervised Learning
Natalia Neverova, Christian Wolf, Florian Nebout, Graham Taylor

TL;DR
This paper introduces a semi- and weakly-supervised deep learning approach for hand pose estimation that leverages synthetic and real datasets by using an intermediate segmentation representation to improve accuracy.
Contribution
It presents a novel training method that combines synthetic and real data through an intermediate segmentation representation, reducing domain shift and improving hand pose estimation accuracy.
Findings
Reduces joint estimation error by 15.7% on the NYU dataset.
Utilizes an intermediate segmentation to bridge synthetic and real data domains.
Improves hand pose estimation accuracy with semi/weakly-supervised learning.
Abstract
We propose a method for hand pose estimation based on a deep regressor trained on two different kinds of input. Raw depth data is fused with an intermediate representation in the form of a segmentation of the hand into parts. This intermediate representation contains important topological information and provides useful cues for reasoning about joint locations. The mapping from raw depth to segmentation maps is learned in a semi/weakly-supervised way from two different datasets: (i) a synthetic dataset created through a rendering pipeline including densely labeled ground truth (pixelwise segmentations); and (ii) a dataset with real images for which ground truth joint positions are available, but not dense segmentations. Loss for training on real images is generated from a patch-wise restoration process, which aligns tentative segmentation maps with a large dictionary of synthetic poses.…
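To make the patch-wise restoration idea concrete, here is a minimal sketch, not the authors' implementation. It assumes that each patch is a flattened tuple of integer part labels and that "aligning" a tentative patch with the dictionary means picking the nearest synthetic patch under Hamming distance. The names `hamming`, `restore_patch`, and `restoration_targets`, the label encoding, and the distance metric are all hypothetical choices made for illustration; the abstract does not specify them.

```python
from typing import List, Sequence, Tuple

# Hypothetical encoding: a patch is a flattened tuple of hand-part labels.
Patch = Tuple[int, ...]


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the pixels whose part labels differ between two patches."""
    return sum(x != y for x, y in zip(a, b))


def restore_patch(tentative: Patch, dictionary: List[Patch]) -> Patch:
    """Return the synthetic dictionary patch closest to the tentative one.

    Hamming distance is an assumed metric, chosen only for this sketch.
    """
    return min(dictionary, key=lambda p: hamming(tentative, p))


def restoration_targets(tentative_patches: List[Patch],
                        dictionary: List[Patch]) -> List[Patch]:
    """Restore every tentative patch; the results can serve as surrogate
    segmentation targets for real images that lack dense labels."""
    return [restore_patch(p, dictionary) for p in tentative_patches]


# Toy data: three synthetic 2x2 patches and two noisy predicted patches.
dictionary = [(0, 0, 1, 1), (1, 1, 2, 2), (2, 2, 2, 2)]
tentative = [(0, 1, 1, 1), (2, 2, 1, 2)]
targets = restoration_targets(tentative, dictionary)
print(targets)  # [(0, 0, 1, 1), (2, 2, 2, 2)]
```

In this toy setup, the disagreement between each tentative patch and its restored counterpart is what a training loss on real images could penalize. A real dictionary of synthetic poses would be far larger and would call for an approximate nearest-neighbor search rather than the linear scan used here.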
