Learning Delicate Local Representations for Multi-Person Pose Estimation
Yuanhao Cai, Zhicheng Wang, Zhengxiong Luo, Binyi Yin, Angang Du, Haoqian Wang, Xiangyu Zhang, Xinyu Zhou, Erjin Zhou, Jian Sun

TL;DR
This paper introduces the Residual Steps Network (RSN) and the Pose Refine Machine (PRM) to improve multi-person pose estimation by capturing delicate local features and refining keypoint locations, achieving state-of-the-art results on the COCO and MPII benchmarks.
Contribution
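The paper presents the RSN architecture, which aggregates intra-level features (features with the same spatial size) to extract delicate local representations, and the attention-based PRM, which balances local and global representations in the output features to refine keypoint locations.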
Findings
Won 1st place at COCO Keypoint Challenge 2019
Achieved 78.6 AP on COCO test-dev without extra data
Ensembled models reached 79.2 AP on COCO test-dev
Abstract
In this paper, we propose a novel method called Residual Steps Network (RSN). RSN aggregates features with the same spatial size (Intra-level features) efficiently to obtain delicate local representations, which retain rich low-level spatial information and result in precise keypoint localization. Additionally, we observe the output features contribute differently to final performance. To tackle this problem, we propose an efficient attention mechanism - Pose Refine Machine (PRM) to make a trade-off between local and global representations in output features and further refine the keypoint locations. Our approach won the 1st place of COCO Keypoint Challenge 2019 and achieves state-of-the-art results on both COCO and MPII benchmarks, without using extra training data and pretrained model. Our single model achieves 78.6 on COCO test-dev, 93.0 on MPII test dataset. Ensembled models achieve…
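The abstract describes PRM only as an efficient attention mechanism that reweights output features. As a rough illustration of what such reweighting can look like, the sketch below applies a generic squeeze-and-excitation-style channel gate in NumPy. It is not the paper's PRM: the function name `channel_gate`, the weight shapes, and the random inputs are all assumptions made for this example.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_gate(features, w1, w2):
    """Reweight the channels of a (C, H, W) feature map with a learned gate.

    Hypothetical helper, not the paper's PRM.
    w1 has shape (C_hidden, C) and w2 has shape (C, C_hidden).
    """
    pooled = features.mean(axis=(1, 2))      # global average pool -> (C,)
    hidden = np.maximum(w1 @ pooled, 0.0)    # bottleneck layer with ReLU
    gate = sigmoid(w2 @ hidden)              # per-channel weight between 0 and 1
    return features * gate[:, None, None], gate


# Random stand-ins for a feature map and learned weights (assumed values).
rng = np.random.default_rng(0)
features = rng.standard_normal((8, 4, 4))
w1 = rng.standard_normal((2, 8))
w2 = rng.standard_normal((8, 2))

refined, gate = channel_gate(features, w1, w2)
print(refined.shape, gate.shape)
```

Each channel is scaled by a single learned weight, so channels that contribute more to the final prediction can be emphasized and the rest suppressed. A trained model would learn `w1` and `w2` rather than sample them.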
Taxonomy
Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Video Surveillance and Tracking Methods
