Rethinking on Multi-Stage Networks for Human Pose Estimation

Wenbo Li; Zhicheng Wang; Binyi Yin; Qixiang Peng; Yuming Du; Tianzi Xiao; Gang Yu; Hongtao Lu; Yichen Wei; Jian Sun

arXiv:1901.00148·cs.CV·May 31, 2019·109 cites

TL;DR

This paper revisits multi-stage human pose estimation, attributes its weaker results to insufficient design choices, and proposes an improved single-stage module design, cross-stage feature aggregation, and coarse-to-fine supervision, reaching state-of-the-art results on the MS COCO and MPII datasets.

Contribution

It proposes three design improvements for multi-stage networks: a better design of the single-stage module, cross-stage feature aggregation, and coarse-to-fine supervision. Together these make a multi-stage network outperform single-stage methods.

Findings

01. Set a new state of the art on the MS COCO and MPII Human Pose datasets.

02. Demonstrated that a multi-stage architecture is effective once the proposed improvements are applied.

03. Released the source code for further research.

Abstract

Existing pose estimation approaches fall into two categories: single-stage and multi-stage methods. While multi-stage methods are seemingly more suited for the task, their performance in current practice is not as good as single-stage methods. This work studies this issue. We argue that the current multi-stage methods' unsatisfactory performance comes from the insufficiency in various design choices. We propose several improvements, including the single-stage module design, cross-stage feature aggregation, and coarse-to-fine supervision. The resulting method establishes the new state-of-the-art on both the MS COCO and MPII Human Pose datasets, justifying the effectiveness of a multi-stage architecture. The source code is publicly available for further research.
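The abstract names coarse-to-fine supervision without describing it. One plausible reading, assumed here and not confirmed by this page, is that earlier stages are trained against broad Gaussian heatmap targets and later stages against sharper ones. The sketch below only illustrates that idea for a single keypoint. The function names, the 16x16 grid, and the sigma values are hypothetical and are not taken from the authors' code.

```python
import math


def gaussian_heatmap(width, height, cx, cy, sigma):
    """Return a height x width grid that peaks at (cx, cy) with spread sigma."""
    return [
        [
            math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
            for x in range(width)
        ]
        for y in range(height)
    ]


def coarse_to_fine_targets(width, height, cx, cy, sigmas):
    """Build one target heatmap per stage.

    `sigmas` is ordered from the first stage to the last and is expected
    to decrease, so later stages get tighter targets.
    """
    return [gaussian_heatmap(width, height, cx, cy, s) for s in sigmas]


# Hypothetical per-stage spreads, chosen for illustration only.
STAGE_SIGMAS = [3.0, 2.0, 1.0]

targets = coarse_to_fine_targets(16, 16, cx=8, cy=8, sigmas=STAGE_SIGMAS)

# Total heatmap mass per stage; it shrinks as the target gets sharper.
masses = [sum(sum(row) for row in t) for t in targets]
```

Every stage's target peaks at the same keypoint location, and only the spread changes. In a training loop, each stage's predicted heatmap would be compared against its own target, so early stages are only asked for a rough localization.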

Taxonomy

Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Advanced Vision and Imaging