Cascaded Pyramid Network for Multi-Person Pose Estimation
Yilun Chen, Zhicheng Wang, Yuxiang Peng, Zhiqiang Zhang, Gang Yu, Jian Sun

TL;DR
This paper introduces the Cascaded Pyramid Network (CPN), a two-stage deep learning architecture designed to improve multi-person pose estimation, especially for occluded or invisible keypoints, achieving state-of-the-art results on COCO benchmarks.
Contribution
The paper proposes a novel CPN architecture with GlobalNet and RefineNet to effectively handle hard keypoints and integrates online hard keypoint mining for improved accuracy.
Findings
Achieved 73.0 AP on COCO test-dev and 72.1 AP on COCO test-challenge; the latter is a 19% relative improvement over the 60.5 AP from the COCO 2016 keypoint challenge.
Successfully localizes occluded and invisible keypoints in multi-person scenarios.
Outperforms the previous state of the art on the COCO keypoint benchmark.
Abstract
The topic of multi-person pose estimation has been largely improved recently, especially with the development of convolutional neural network. However, there still exist a lot of challenging cases, such as occluded keypoints, invisible keypoints and complex background, which cannot be well addressed. In this paper, we present a novel network structure called Cascaded Pyramid Network (CPN) which targets to relieve the problem from these "hard" keypoints. More specifically, our algorithm includes two stages: GlobalNet and RefineNet. GlobalNet is a feature pyramid network which can successfully localize the "simple" keypoints like eyes and hands but may fail to precisely recognize the occluded or invisible keypoints. Our RefineNet tries explicitly handling the "hard" keypoints by integrating all levels of feature representations from the GlobalNet together with an online hard keypoint…
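The online hard keypoint mining idea mentioned in the abstract can be illustrated with a small sketch. It assumes a per-keypoint L2 (mean squared error) heatmap loss and a free parameter `top_k` for how many of the hardest keypoints contribute to the loss; the function names, the flattened-heatmap representation, and the choice of `top_k` are illustrative assumptions, not details taken from this summary.

```python
from typing import List, Sequence

Heatmap = Sequence[float]  # one keypoint heatmap, flattened to a list of pixels


def per_keypoint_l2(pred: Sequence[Heatmap], target: Sequence[Heatmap]) -> List[float]:
    """Mean squared error between predicted and target heatmaps, one value per keypoint."""
    losses = []
    for p, t in zip(pred, target):
        losses.append(sum((a - b) ** 2 for a, b in zip(p, t)) / len(p))
    return losses


def ohkm_loss(pred: Sequence[Heatmap], target: Sequence[Heatmap], top_k: int) -> float:
    """Average the loss over only the top_k keypoints with the largest loss.

    Hypothetical helper: keeping only the hardest keypoints focuses training
    on the ones the network currently gets most wrong.
    """
    if top_k <= 0:
        raise ValueError("top_k must be positive")
    losses = per_keypoint_l2(pred, target)
    hardest = sorted(losses, reverse=True)[:top_k]
    return sum(hardest) / len(hardest)


# Toy example: 3 keypoints, 4 pixels each, all-zero targets.
pred = [[0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 0.0, 0.0], [2.0, 0.0, 0.0, 0.0]]
target = [[0.0] * 4 for _ in range(3)]
loss_hardest_two = ohkm_loss(pred, target, top_k=2)  # uses losses 1.0 and 0.5
```

In the toy example the easy first keypoint (loss 0.0) is ignored when `top_k=2`, so the result is the mean of the two larger losses. A real training loop would compute this on tensors per sample in a batch, but the selection step is the same.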
Taxonomy
Topics: Human Pose and Action Recognition · Video Surveillance and Tracking Methods · Anomaly Detection Techniques and Applications
