3D Hand Pose and Shape Estimation from RGB Images for Keypoint-Based Hand Gesture Recognition

Danilo Avola; Luigi Cinque; Alessio Fagioli; Gian Luca Foresti; Adriano Fragomeni; Daniele Pannone

arXiv:2109.13879·cs.CV·May 10, 2022

TL;DR

This paper introduces a robust end-to-end framework for 3D hand pose and shape estimation from RGB images, significantly improving stability and accuracy for hand gesture recognition tasks.

Contribution

The paper presents a novel keypoint-based end-to-end system that enhances 3D hand estimation stability and accuracy using multi-task learning and a viewpoint encoder.

Findings

01 Achieved state-of-the-art results on 3D hand pose and shape estimation benchmarks.

02 Outperformed existing keypoint-based methods on hand gesture recognition datasets.

03 Demonstrated robustness and stability in real-life scenarios.

Abstract

Estimating the 3D pose of a hand from a 2D image is a well-studied problem and a requirement for several real-life applications such as virtual reality, augmented reality, and hand gesture recognition. Currently, reasonable estimations can be computed from single RGB images, especially when a multi-task learning approach is used to force the system to consider the shape of the hand when its pose is determined. However, depending on the method used to represent the hand, the performance can drop considerably in real-life tasks, suggesting that stable descriptions are required to achieve satisfactory results. In this paper, we present a keypoint-based end-to-end framework for 3D hand pose and shape estimation and successfully apply it to the task of hand gesture recognition as a case study. Specifically, after a pre-processing step in which the images are normalized, the proposed pipeline uses…
