A2J: Anchor-to-Joint Regression Network for 3D Articulated Pose Estimation from a Single Depth Image

Fu Xiong; Boshen Zhang; Yang Xiao; Zhiguo Cao; Taidong Yu; Joey Tianyi Zhou; Junsong Yuan

arXiv:1908.09999·cs.CV·August 28, 2019·21 cites

TL;DR

This paper introduces A2J, an anchor-to-joint regression network that estimates 3D hand and body pose from a single depth image. Anchor points densely placed on the image act as local regressors, and their predictions are combined to locate each joint with high accuracy and speed.

Contribution

The paper proposes a new anchor-based 3D pose estimation method that outperforms existing encoder-decoder and 3D CNN approaches in accuracy and efficiency.

Findings

01 Achieves high accuracy on multiple datasets

02 Runs at around 100 FPS on a single GPU

03 Outperforms state-of-the-art methods in 3D pose estimation

Abstract

For 3D hand and body pose estimation from a depth image, a novel anchor-based approach termed the Anchor-to-Joint regression network (A2J), with end-to-end learning ability, is proposed. Within A2J, anchor points able to capture global-local spatial context information are densely set on the depth image as local regressors for the joints. Together they predict the positions of the joints in an ensemble way to enhance generalization ability. The proposed 3D articulated pose estimation paradigm differs from the state-of-the-art encoder-decoder based FCN, 3D CNN, and point-set based approaches. To discover informative anchor points for a given joint, an anchor proposal procedure is also proposed for A2J. Meanwhile, a 2D CNN (i.e., ResNet-50) is used as the backbone network to drive A2J, without using time-consuming 3D convolutional or deconvolutional layers. The experiments on 3 hand datasets and…
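The ensemble idea in the abstract, where densely placed anchors each act as a local regressor for every joint, can be illustrated with a small NumPy sketch. It is only an illustration under stated assumptions, not the authors' implementation: it assumes each anchor outputs an in-plane offset, a depth value, and an unnormalized response per joint, and that the per-anchor predictions are combined by a softmax-weighted average. The function names (`make_anchor_grid`, `aggregate_joints`), the grid stride, and the array shapes are hypothetical, and plain arrays stand in for the outputs a backbone network would produce.

```python
import numpy as np


def make_anchor_grid(height, width, stride):
    """Place anchor points on a regular grid over the image plane.

    Returns an (A, 2) array of (x, y) pixel coordinates. The stride is an
    illustrative assumption, not a value taken from the paper.
    """
    ys = np.arange(stride // 2, height, stride)
    xs = np.arange(stride // 2, width, stride)
    grid_x, grid_y = np.meshgrid(xs, ys)
    return np.stack([grid_x.ravel(), grid_y.ravel()], axis=1).astype(np.float64)


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def aggregate_joints(anchors, offsets, depths, responses):
    """Combine per-anchor predictions into one 3D position per joint.

    anchors:   (A, 2) anchor positions in the image plane
    offsets:   (A, J, 2) assumed in-plane offset from each anchor to each joint
    depths:    (A, J) assumed per-anchor depth estimate for each joint
    responses: (A, J) assumed unnormalized anchor-to-joint informativeness score
    Returns a (J, 3) array of (x, y, depth) joint estimates.
    """
    weights = softmax(responses, axis=0)        # normalize over anchors, per joint
    votes_xy = anchors[:, None, :] + offsets    # where each anchor places each joint
    joints_xy = (weights[:, :, None] * votes_xy).sum(axis=0)
    joints_z = (weights * depths).sum(axis=0)
    return np.concatenate([joints_xy, joints_z[:, None]], axis=1)
```

In this sketch, anchors with a low response for a joint contribute little to that joint's estimate, which is one plausible reading of how informative anchors could be selected per joint. In a trained model the three input arrays would come from network heads on top of the 2D CNN backbone.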

Taxonomy

Topics: Human Pose and Action Recognition · Hand Gesture Recognition Systems · Diabetic Foot Ulcer Assessment and Management

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Max Pooling · Convolution · Fully Convolutional Network