Cascade Feature Aggregation for Human Pose Estimation
Zhihui Su, Ming Ye, Guohui Zhang, Lei Dai, Jianda Sheng

TL;DR
This paper introduces Cascade Feature Aggregation (CFA), a method that cascades several hourglass networks, aggregates features across stages, and fuses the per-stage results to improve human pose estimation accuracy, with experiments on the MPII and LIP datasets.
Contribution
The novel CFA approach cascades hourglass networks and aggregates features across stages for improved robustness and localization in pose estimation.
Findings
Reports results that outperform the state of the art on the MPII benchmark
Demonstrates robustness to occlusions and low resolution
Abstract
Human pose estimation plays an important role in many computer vision tasks and has been studied for many decades. However, due to complex appearance variations caused by pose, illumination, occlusion and low resolution, it remains a challenging problem. Taking advantage of high-level semantic information from deep convolutional neural networks is an effective way to improve the accuracy of human pose estimation. In this paper, we propose a novel Cascade Feature Aggregation (CFA) method, which cascades several hourglass networks for robust human pose estimation. Features from different stages are aggregated to obtain abundant contextual information, leading to robustness to poses, partial occlusions and low resolution. Moreover, results from different stages are fused to further improve the localization accuracy. Extensive experiments on the MPII and LIP datasets…
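The abstract says that results from different stages are fused, but the excerpt does not state the fusion rule. As a minimal sketch only, assuming the fusion is a plain element-wise average of per-stage heatmaps for one joint, the idea might look like the following. The names `fuse_heatmaps` and `argmax_2d` and the toy 3x3 heatmaps are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Heatmap = List[List[float]]  # one joint's score map, indexed [row][col]


def fuse_heatmaps(stage_heatmaps: List[Heatmap]) -> Heatmap:
    """Average per-stage heatmaps element-wise (an assumed fusion rule)."""
    if not stage_heatmaps:
        raise ValueError("need at least one stage heatmap")
    n = len(stage_heatmaps)
    rows, cols = len(stage_heatmaps[0]), len(stage_heatmaps[0][0])
    return [
        [sum(h[r][c] for h in stage_heatmaps) / n for c in range(cols)]
        for r in range(rows)
    ]


def argmax_2d(heatmap: Heatmap) -> Tuple[int, int]:
    """Return the (row, col) position of the largest score."""
    best = (0, 0)
    for r, row in enumerate(heatmap):
        for c, value in enumerate(row):
            if value > heatmap[best[0]][best[1]]:
                best = (r, c)
    return best


# Toy heatmaps for a single joint from three hypothetical cascade stages.
# The first stage peaks at the wrong cell; the later stages agree on (1, 1).
stage1 = [[0.1, 0.6, 0.1], [0.1, 0.5, 0.1], [0.0, 0.1, 0.0]]
stage2 = [[0.0, 0.2, 0.1], [0.1, 0.8, 0.1], [0.0, 0.1, 0.0]]
stage3 = [[0.0, 0.1, 0.0], [0.1, 0.9, 0.2], [0.0, 0.1, 0.0]]

fused = fuse_heatmaps([stage1, stage2, stage3])
joint_rc = argmax_2d(fused)  # joint location read from the fused map
```

In this toy case the first stage alone would place the joint at row 0, column 1, while the averaged map places it at row 1, column 1, which illustrates how combining stages can correct a single stage's error. A weighted average or fusing only the last few stages would be equally plausible variants.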
Taxonomy
Topics: Human Pose and Action Recognition · Gait Recognition and Analysis · Anomaly Detection Techniques and Applications
