Joint Training of a Convolutional Network and a Graphical Model for Human Pose Estimation

Jonathan Tompson; Arjun Jain; Yann LeCun; Christoph Bregler

arXiv:1406.2984 · cs.CV · September 19, 2014 · 975 cites

TL;DR

This paper introduces a hybrid deep learning and graphical model architecture for human pose estimation, leveraging structural constraints to improve accuracy in monocular images.

Contribution

It presents a novel joint training approach for combining convolutional networks with Markov Random Fields for pose estimation.

Findings

01. Significant performance improvement over existing methods
02. Effective exploitation of geometric constraints
03. Outperforms state-of-the-art techniques

Abstract

This paper proposes a new hybrid architecture that consists of a deep Convolutional Network and a Markov Random Field. We show how this architecture is successfully applied to the challenging problem of articulated human pose estimation in monocular images. The architecture can exploit structural domain constraints such as geometric relationships between body joint locations. We show that joint training of these two model paradigms improves performance and allows us to significantly outperform existing state-of-the-art techniques.
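The abstract's idea of using geometric relationships between joint locations to clean up per-joint predictions can be illustrated with a toy sketch. The code below is not the paper's implementation: the function names `message` and `refine`, the fixed `bias` value, and the hand-written offset prior are all assumptions made for illustration, and the joint training of the two components is not shown. It only demonstrates how a pairwise prior over relative joint offsets can suppress a detection that is geometrically implausible given a neighbouring joint.

```python
import numpy as np


def message(prior, heat_v):
    """Evidence for joint A at each pixel, given the heatmap of neighbour v.

    prior has shape (2H-1, 2W-1); prior[H-1+dy, W-1+dx] is the (assumed)
    likelihood that A lies at offset (dy, dx) from v.
    """
    H, W = heat_v.shape
    out = np.zeros((H, W))
    for y in range(H):
        for x in range(W):
            # Flipped window so that entry [y', x'] holds prior at offset (y-y', x-x').
            window = prior[y:y + H, x:x + W][::-1, ::-1]
            out[y, x] = np.sum(window * heat_v)
    return out


def refine(heat_a, neighbours, bias=1e-3, eps=1e-12):
    """Multiply A's heatmap by one message per neighbour (done in log space)."""
    log_score = np.log(heat_a + eps)
    for prior, heat_v in neighbours:
        # The bias keeps a weak or missing neighbour from zeroing out A entirely.
        log_score += np.log(message(prior, heat_v) + bias)
    score = np.exp(log_score)
    return score / score.sum()


# Toy example on a 5x5 grid (all values hypothetical).
H = W = 5
heat_v = np.zeros((H, W))
heat_v[2, 2] = 1.0                 # neighbour v is confidently at (2, 2)

prior = np.zeros((2 * H - 1, 2 * W - 1))
prior[H - 1 + 1, W - 1 + 0] = 1.0  # A is expected one row below v

heat_a = np.zeros((H, W))
heat_a[3, 2] = 0.5                 # plausible detection (one row below v)
heat_a[0, 4] = 0.5                 # equally strong but implausible detection

refined = refine(heat_a, [(prior, heat_v)])
best = tuple(int(i) for i in np.unravel_index(np.argmax(refined), refined.shape))
print(best)
```

In this toy setup the two candidate locations for joint A start with equal scores, and the refinement step favours the one consistent with the neighbour's position. A learned version would replace the hand-written `prior` and `bias` with trainable parameters.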

Code & Models

Repositories

max-andr/joint-cnn-mrf (TensorFlow)

Taxonomy

Topics: Human Pose and Action Recognition · Hand Gesture Recognition Systems · Video Surveillance and Tracking Methods

Methods: Heatmap