JGR-P2O: Joint Graph Reasoning based Pixel-to-Offset Prediction Network for 3D Hand Pose Estimation from a Single Depth Image

Linpu Fang, Xingyan Liu, Li Liu, Hang Xu, and Wenxiong Kang

arXiv:2007.04646 · cs.CV · July 13, 2020 · 5 cites

TL;DR

This paper introduces JGR-P2O, a pixel-wise prediction network with joint graph reasoning for accurate and efficient 3D hand pose estimation from a single depth image. It reports state-of-the-art accuracy on multiple benchmarks.

Contribution

It proposes a GCN-based joint graph reasoning module combined with dense pixel offset prediction for end-to-end 3D hand pose estimation.
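
To make the idea of graph reasoning over joints more concrete, here is a minimal sketch of one generic graph-convolution step over a small joint graph. It is not the authors' module: the row-normalised adjacency with self-loops, the ReLU activation, and the names `gcn_layer` and `matmul` are all assumptions chosen for illustration.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain dense matrix product for small illustrative inputs."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def gcn_layer(adj: Matrix, feats: Matrix, weight: Matrix) -> Matrix:
    """One generic graph-convolution step (hypothetical helper).

    adj    : (J, J) adjacency between joints, without self-loops
    feats  : (J, C_in) one feature vector per joint
    weight : (C_in, C_out) shared linear transform
    """
    n = len(adj)
    # Add self-loops so every joint keeps part of its own feature.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    # Row-normalise so each joint averages over itself and its neighbours.
    a_norm = [[v / sum(row) for v in row] for row in a_hat]
    # Aggregate neighbour features, transform them, then apply ReLU.
    out = matmul(matmul(a_norm, feats), weight)
    return [[max(0.0, v) for v in row] for row in out]
```

With two connected joints carrying features 1.0 and 3.0 and an identity weight, each joint ends up with the average of the two features, which shows how information propagates along the edges of the joint graph.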

Findings

01 Achieves state-of-the-art accuracy on multiple benchmarks.

02 Runs at approximately 110fps on a single GPU.

03 Uses only 1.4 million parameters for efficiency.

Abstract

State-of-the-art single depth image-based 3D hand pose estimation methods are based on dense predictions, including voxel-to-voxel predictions, point-to-point regression, and pixel-wise estimations. Despite their good performance, these methods have some inherent issues, such as a poor trade-off between accuracy and efficiency and plain feature representation learning with local convolutions. In this paper, a novel pixel-wise prediction-based method is proposed to address these issues. The key ideas are two-fold: a) explicitly modeling the dependencies among joints and the relations between the pixels and the joints for better local feature representation learning; b) unifying the dense pixel-wise offset predictions and direct joint regression for end-to-end training. Specifically, we first propose a graph convolutional network (GCN) based joint graph reasoning module to model…
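
One way to picture how dense pixel-wise offset predictions can be unified with direct joint regression is a weighted vote: each pixel predicts an offset to a joint, and the joint position is the weighted average of "pixel position + offset" over all pixels. The sketch below illustrates that idea only. The abstract above is truncated, so the exact formulation is not confirmed here; the function name `aggregate_joint`, the (u, v, d) coordinate layout, and the weight normalisation are assumptions.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def aggregate_joint(
    pixels: Sequence[Vec3],
    offsets: Sequence[Vec3],
    weights: Sequence[float],
) -> Vec3:
    """Combine per-pixel votes into one joint position (hypothetical helper).

    pixels  : (u, v, d) coordinates of each pixel
    offsets : predicted (du, dv, dd) from each pixel to the joint
    weights : non-negative confidence of each pixel's vote
    """
    if not (len(pixels) == len(offsets) == len(weights)):
        raise ValueError("pixels, offsets and weights must have equal length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    joint = [0.0, 0.0, 0.0]
    for p, o, w in zip(pixels, offsets, weights):
        for k in range(3):
            # Each vote is the pixel position plus its predicted offset,
            # scaled by the pixel's normalised weight.
            joint[k] += (w / total) * (p[k] + o[k])
    return (joint[0], joint[1], joint[2])
```

Because this aggregation uses only sums and products, it is differentiable with respect to both offsets and weights, so in a deep-learning framework a loss on the aggregated joint position could be back-propagated to the dense per-pixel predictions.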

Code & Models

Repositories

fanglinpu/JGR-P2O · tf · Official

Taxonomy

Topics: Human Pose and Action Recognition · Hand Gesture Recognition Systems · Anomaly Detection Techniques and Applications

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings