Recurrent Human Pose Estimation
Vasileios Belagiannis, Andrew Zisserman

TL;DR
This paper introduces a ConvNet for 2D human pose estimation that pairs a feed-forward module with a recurrent module for iterative refinement, is trained end-to-end from scratch with auxiliary losses, and investigates whether keypoint visibility can also be predicted. It performs on par with the state of the art using a simpler architecture that needs no graphical model stage.
Contribution
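It presents a novel architecture that combines a feed-forward module with a recurrent module that can be run iteratively, trains it end-to-end and from scratch with auxiliary losses, and investigates keypoint visibility prediction. The model matches state-of-the-art performance without a graphical model stage.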
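The iterative refinement and auxiliary losses described above can be sketched roughly as follows. This is a toy illustration, not the authors' network: `feed_forward`, `recurrent_step`, the elementwise weights, and the unweighted sum of per-stage losses are all hypothetical stand-ins chosen to keep the example small. The only points it aims to show are that the recurrent step reuses the same weights at every iteration and that a loss is attached to every intermediate heatmap.

```python
import numpy as np


def feed_forward(image, w_ff):
    # Hypothetical stand-in for the feed-forward module:
    # a single elementwise scaling that yields an initial heatmap.
    return image * w_ff


def recurrent_step(image, heatmap, w_img, w_heat):
    # Hypothetical stand-in for the recurrent module: it sees the image
    # and the previous heatmap, and the same weights are reused each iteration.
    return image * w_img + heatmap * w_heat


def mse(prediction, target):
    # Mean squared error between a predicted and a target heatmap.
    return float(np.mean((prediction - target) ** 2))


def total_loss(image, target, w_ff, w_img, w_heat, iterations=2):
    # One loss on the feed-forward output, then one auxiliary loss
    # per recurrent iteration (assumed here to be summed with equal weight).
    heatmap = feed_forward(image, w_ff)
    losses = [mse(heatmap, target)]
    for _ in range(iterations):
        heatmap = recurrent_step(image, heatmap, w_img, w_heat)
        losses.append(mse(heatmap, target))
    return sum(losses), losses


image = np.ones((8, 8))
target = np.ones((8, 8))
total, losses = total_loss(image, target, w_ff=0.5, w_img=0.5, w_heat=0.5)
```

With these made-up weights the per-stage error shrinks at every iteration, which mirrors the intended effect of refinement; in a real model the weights would be learned by backpropagating `total` through all stages.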
Findings
Performs on par with the state of the art on two benchmark datasets.
Simplifies the pose estimation architecture by dropping the graphical model stage (or layers).
Investigates whether keypoint visibility can also be predicted.
Abstract
We propose a novel ConvNet model for predicting 2D human body poses in an image. The model regresses a heatmap representation for each body keypoint, and is able to learn and represent both the part appearances and the context of the part configuration. We make the following three contributions: (i) an architecture combining a feed forward module with a recurrent module, where the recurrent module can be run iteratively to improve the performance, (ii) the model can be trained end-to-end and from scratch, with auxiliary losses incorporated to improve performance, (iii) we investigate whether keypoint visibility can also be predicted. The model is evaluated on two benchmark datasets. The result is a simple architecture that achieves performance on par with the state of the art, but without the complexity of a graphical model stage (or layers).
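To make the heatmap representation mentioned in the abstract concrete, here is a minimal sketch of how a keypoint can be encoded as a target heatmap and read back from a predicted one. The Gaussian form, the `sigma` value, the 32x32 resolution, and the function names are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def gaussian_heatmap(height, width, cx, cy, sigma=1.5):
    # Build a target heatmap with a Gaussian bump centred on the
    # keypoint at column cx, row cy (sigma is an assumed value).
    ys, xs = np.mgrid[0:height, 0:width]
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))


def decode_keypoint(heatmap):
    # Recover the keypoint as the (x, y) location of the heatmap maximum.
    row, col = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return int(col), int(row)


# One heatmap per body keypoint; here a single keypoint at x=10, y=20.
target = gaussian_heatmap(32, 32, cx=10, cy=20)
x, y = decode_keypoint(target)
```

A model of this kind would output one such map per keypoint and be trained to regress the target maps, so decoding a pose amounts to calling `decode_keypoint` on each output channel.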
Taxonomy
Methods: Heatmap
