Towards Viewpoint Invariant 3D Human Pose Estimation

Albert Haque; Boya Peng; Zelun Luo; Alexandre Alahi; Serena Yeung; Li Fei-Fei

arXiv:1603.07076 · cs.CV · July 27, 2016

TL;DR

This paper introduces a viewpoint invariant model for 3D human pose estimation from a single depth image. The model handles occlusion and noise and performs well across diverse viewpoints.

Contribution

It presents a novel multi-task learning framework with a convolutional-recurrent architecture and error feedback for viewpoint invariant 3D pose estimation.

Findings

01. Achieves state-of-the-art performance on non-frontal viewpoints.

02. Performs competitively on frontal views.

03. Effectively handles occlusion and noise in depth images.

Abstract

We propose a viewpoint invariant model for 3D human pose estimation from a single depth image. To achieve this, our discriminative model embeds local regions into a learned viewpoint invariant feature space. Formulated as a multi-task learning problem, our model is able to selectively predict partial poses in the presence of noise and occlusion. Our approach leverages a convolutional and recurrent network architecture with a top-down error feedback mechanism to self-correct previous pose estimates in an end-to-end manner. We evaluate our model on a previously published depth dataset and a newly collected human pose dataset containing 100K annotated depth images from extreme viewpoints. Experiments show that our model achieves competitive performance on frontal views while achieving state-of-the-art performance on alternate viewpoints.
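
The error feedback idea in the abstract, where the model corrects its own previous pose estimate over several steps and reports only a partial pose when joints are unreliable, can be illustrated with a toy loop. This is a minimal sketch under stated assumptions, not the authors' network: `refine_pose`, `make_toy_predictor`, the `visible` mask, and the `rate` parameter are all hypothetical names introduced here, the visibility mask is supplied by the caller rather than predicted, and the "predictor" is a stand-in function rather than a convolutional-recurrent model.

```python
from typing import Callable, List, Optional, Tuple

Joint = Tuple[float, float, float]  # (x, y, z) position of one body joint


def refine_pose(
    initial: List[Joint],
    predict_offsets: Callable[[List[Joint]], List[Joint]],
    visible: List[bool],
    steps: int = 3,
) -> List[Optional[Joint]]:
    """Iteratively correct a pose estimate.

    Each step asks the predictor for per-joint offsets given the current
    estimate and adds them, so later steps can fix earlier mistakes.
    Joints flagged as not visible are returned as None (a partial pose).
    """
    pose = list(initial)
    for _ in range(steps):
        offsets = predict_offsets(pose)
        pose = [
            (x + dx, y + dy, z + dz)
            for (x, y, z), (dx, dy, dz) in zip(pose, offsets)
        ]
    return [joint if seen else None for joint, seen in zip(pose, visible)]


def make_toy_predictor(
    target: List[Joint], rate: float = 0.5
) -> Callable[[List[Joint]], List[Joint]]:
    """Hypothetical stand-in for a learned model.

    It returns offsets that move each joint a fixed fraction of the way
    toward a known target. A real model would infer offsets from the
    depth image and its recurrent state instead.
    """
    def predict(pose: List[Joint]) -> List[Joint]:
        return [
            tuple(rate * (t - p) for t, p in zip(target_joint, joint))
            for target_joint, joint in zip(target, pose)
        ]
    return predict


if __name__ == "__main__":
    target = [(1.0, 2.0, 3.0), (0.0, 0.0, 0.0)]
    start = [(0.0, 0.0, 0.0), (4.0, 4.0, 4.0)]
    predictor = make_toy_predictor(target, rate=0.5)
    # The second joint is treated as occluded, so it is omitted from the output.
    print(refine_pose(start, predictor, visible=[True, False], steps=3))
```

Predicting offsets rather than absolute positions is what lets each step build on the previous estimate; with the toy predictor the remaining error halves at every step.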

Taxonomy

Topics: Human Pose and Action Recognition · Video Surveillance and Tracking Methods · Advanced Vision and Imaging