Convolutional Pose Machines
Shih-En Wei, Varun Ramakrishna, Takeo Kanade, Yaser Sheikh

TL;DR
This paper introduces Convolutional Pose Machines, a sequential convolutional architecture that models long-range dependencies between body parts for pose estimation and reports state-of-the-art results without explicit graphical-model inference.
Contribution
It presents a systematic way to incorporate convolutional networks into pose machines, enabling implicit modeling of spatial dependencies and refined part localization.
Findings
Reports state-of-the-art performance on the MPII, LSP, and FLIC pose estimation benchmarks, outperforming competing methods.
Counters vanishing gradients during training with intermediate supervision, that is, a loss applied to the output of each stage rather than only the last.
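The intermediate supervision named in the findings can be sketched as a loss that sums one term per stage. This is a minimal illustration, not the authors' code: it assumes a squared L2 distance between each stage's belief maps and a Gaussian target centered on the ground-truth part location. The function names, the sigma value, the map size, and the noisy "stage outputs" are all hypothetical stand-ins.

```python
import numpy as np

def gaussian_belief_map(height, width, center_xy, sigma=1.5):
    """Target map: a Gaussian peak at the ground-truth (x, y) part location."""
    ys, xs = np.mgrid[0:height, 0:width]
    cx, cy = center_xy
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))

def intermediate_supervision_loss(stage_beliefs, target):
    """Sum of per-stage squared L2 distances to the same target.

    stage_beliefs: list of arrays shaped (parts, H, W), one per stage.
    target: array shaped (parts, H, W).
    Returns (total_loss, per_stage_losses).
    """
    per_stage = [float(np.sum((b - target) ** 2)) for b in stage_beliefs]
    return sum(per_stage), per_stage

# Illustrative data: two parts on a 16x16 grid, three stages whose
# outputs are the target plus progressively smaller noise (assumed).
rng = np.random.default_rng(0)
target = np.stack([
    gaussian_belief_map(16, 16, (5, 6)),
    gaussian_belief_map(16, 16, (10, 3)),
])
stage_beliefs = [target + s * rng.standard_normal(target.shape)
                 for s in (0.5, 0.2, 0.05)]
total, per_stage = intermediate_supervision_loss(stage_beliefs, target)
```

Because every stage contributes its own term to `total`, each stage receives a gradient signal directly from its own output instead of relying only on gradients propagated back from the final stage.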
Abstract
Pose Machines provide a sequential prediction framework for learning rich implicit spatial models. In this work we show a systematic design for how convolutional networks can be incorporated into the pose machine framework for learning image features and image-dependent spatial models for the task of pose estimation. The contribution of this paper is to implicitly model long-range dependencies between variables in structured prediction tasks such as articulated pose estimation. We achieve this by designing a sequential architecture composed of convolutional networks that directly operate on belief maps from previous stages, producing increasingly refined estimates for part locations, without the need for explicit graphical model-style inference. Our approach addresses the characteristic difficulty of vanishing gradients during training by providing a natural learning objective function…
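The sequential design described in the abstract, where each stage operates on the belief maps of the previous one, can be sketched as follows. This is a heavily simplified illustration under stated assumptions: every stage is reduced to a single convolution with random, untrained weights, image features are a fixed array rather than learned, and the helper names, channel counts, and kernel sizes are hypothetical and not taken from the paper.

```python
import numpy as np

def conv2d_same(x, kernels):
    """Naive zero-padded 'same' cross-correlation.

    x: (C_in, H, W); kernels: (C_out, C_in, k, k) with odd k.
    Returns an array shaped (C_out, H, W).
    """
    c_out, c_in, k, _ = kernels.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    h, w = x.shape[1:]
    out = np.zeros((c_out, h, w))
    for i in range(k):
        for j in range(k):
            # (C_out, C_in) contracted with (C_in, H, W) -> (C_out, H, W)
            out += np.tensordot(kernels[:, :, i, j],
                                xp[:, i:i + h, j:j + w], axes=1)
    return out

def run_stages(features, first_kernels, later_kernels):
    """Stage 1 sees image features only; each later stage sees the
    features concatenated with the previous stage's belief maps."""
    beliefs = [conv2d_same(features, first_kernels)]
    for kernels in later_kernels:
        stage_input = np.concatenate([features, beliefs[-1]], axis=0)
        beliefs.append(conv2d_same(stage_input, kernels))
    return beliefs

# Illustrative sizes (assumed): 4 feature channels, 3 parts, 3 stages.
rng = np.random.default_rng(0)
n_feat, n_parts, size = 4, 3, 16
features = rng.standard_normal((n_feat, size, size))
first_kernels = 0.1 * rng.standard_normal((n_parts, n_feat, 3, 3))
later_kernels = [0.1 * rng.standard_normal((n_parts, n_feat + n_parts, 9, 9))
                 for _ in range(2)]
beliefs = run_stages(features, first_kernels, later_kernels)
```

The later stages use a larger kernel here so that a prediction for one part can draw on belief about other parts some distance away, which is the sense in which spatial dependencies are modeled implicitly rather than through a graphical model.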
Videos
Convolutional Pose Machines · YouTube
Taxonomy
Topics: Human Pose and Action Recognition · Domain Adaptation and Few-Shot Learning · Robotics and Sensor-Based Localization
