End-to-end Global to Local CNN Learning for Hand Pose Recovery in Depth Data

Meysam Madadi; Sergio Escalera; Xavier Baro; Jordi Gonzalez

arXiv:1705.09606 · cs.CV · April 13, 2018 · 46 cites

TL;DR

This paper introduces a hierarchical CNN architecture that specializes in local hand joint subsets and fuses their features to improve 3D hand pose estimation from depth data, achieving state-of-the-art results.

Contribution

A novel tree-structured CNN with local joint specialization and end-to-end fusion for enhanced hand pose recovery from depth images.
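The tree structure and fusion step described above can be pictured with a small data-structure sketch. It is only an illustration under stated assumptions: the 21-joint layout, the one-branch-per-finger grouping, the shared wrist joint, and the names `FINGERS`, `build_tree`, and `fuse` are all hypothetical and are not taken from the paper. The feature vectors stand in for CNN branch outputs.

```python
# Hypothetical joint layout: 1 wrist joint plus 4 joints per finger (21 total).
# The paper's actual joint set and branch grouping are not given here.
FINGERS = ["thumb", "index", "middle", "ring", "pinky"]


def build_tree():
    """Return a global node (all joints) and one local-pose leaf per finger."""
    leaves = {}
    for f_idx, name in enumerate(FINGERS):
        start = 1 + 4 * f_idx
        # Assumption: every local pose also includes the wrist (joint 0).
        leaves[name] = [0] + list(range(start, start + 4))
    return {"global": list(range(21)), "local": leaves}


def fuse(local_features):
    """Fuse per-branch feature vectors by concatenation in a fixed branch order."""
    fused = []
    for name in FINGERS:
        fused.extend(local_features[name])
    return fused


tree = build_tree()

# Stand-in for CNN branch outputs: a 2-value feature vector per branch.
features = {
    name: [float(len(joints)), float(joints[-1])]
    for name, joints in tree["local"].items()
}
fused = fuse(features)
```

In a real network the fused vector would feed further layers that regress the full pose, so that gradients from the final pose reach every branch during end-to-end training.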

Findings

1. Outperforms state-of-the-art on NYU and SyntheticHand datasets.
2. Incorporates appearance and physical constraints into the loss function.
3. Uses non-rigid data augmentation to improve training data diversity.

Abstract

Despite recent advances in 3D pose estimation of human hands, especially thanks to the advent of CNNs and depth cameras, this task is still far from being solved. This is mainly due to the highly non-linear dynamics of fingers, which make hand model training a challenging task. In this paper, we exploit a novel hierarchical tree-like structured CNN, in which branches are trained to become specialized in predefined subsets of hand joints, called local poses. We further fuse local pose features, extracted from hierarchical CNN branches, to learn higher order dependencies among joints in the final pose by end-to-end training. In addition, the loss function is defined to incorporate appearance and physical constraints on feasible hand motion and deformation. Finally, we introduce a non-rigid data augmentation approach to increase the amount of training depth data. Experimental results…
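One way to read the idea of a loss that incorporates physical constraints is as a pose error term plus a penalty for implausible hand geometry. The sketch below is a minimal illustration of that general pattern, not the paper's loss: the bone-length bound, the `weight` value, and the function names are assumptions chosen for the example, and the appearance term mentioned in the abstract is omitted.

```python
import math


def joint_error(pred, target):
    """Mean squared Euclidean distance between predicted and target 3D joints."""
    total = 0.0
    for p, t in zip(pred, target):
        total += sum((a - b) ** 2 for a, b in zip(p, t))
    return total / len(pred)


def bone_length_penalty(pred, bones, min_len, max_len):
    """Penalize bones whose predicted length falls outside [min_len, max_len]."""
    penalty = 0.0
    for i, j in bones:
        length = math.dist(pred[i], pred[j])
        penalty += max(0.0, min_len - length) ** 2
        penalty += max(0.0, length - max_len) ** 2
    return penalty


def constrained_loss(pred, target, bones, min_len, max_len, weight=0.1):
    """Pose error plus a weighted physical-plausibility penalty (hypothetical form)."""
    return joint_error(pred, target) + weight * bone_length_penalty(
        pred, bones, min_len, max_len
    )


# Two joints joined by one bone; the predicted bone is longer than allowed.
pred = [(0.0, 0.0, 0.0), (0.0, 0.0, 5.0)]
target = [(0.0, 0.0, 0.0), (0.0, 0.0, 5.0)]
loss = constrained_loss(pred, target, bones=[(0, 1)], min_len=1.0, max_len=3.0)
```

Here the pose error is zero, so the whole loss comes from the bone that exceeds `max_len`. In a trainable model the same terms would be written with differentiable tensor operations.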

Taxonomy

Topics: Human Pose and Action Recognition · Hand Gesture Recognition Systems · Robot Manipulation and Learning