$P^2$ Net: Augmented Parallel-Pyramid Net for Attention Guided Pose Estimation
Luanxuan Hou, Jie Cao, Yuan Zhao, Haifeng Shen, Jian Tang, Ran He

TL;DR
This paper introduces $P^2$ Net, a novel attention-guided pose estimation network with a differentiable data augmentation method, parallel-pyramid structure, and feature refinement techniques, achieving state-of-the-art results on MSCOCO and MPII datasets.
Contribution
The paper presents a new augmented parallel-pyramid network with innovative fusion and refinement modules, along with a differentiable data augmentation strategy for improved pose estimation.
Findings
Achieves top performance on MSCOCO dataset.
Outperforms existing methods on MPII dataset.
Demonstrates effectiveness of attention and fusion modules.
Abstract
We propose an augmented Parallel-Pyramid Net ($P^2$ Net) with feature refinement by dilated bottleneck and attention module. During data preprocessing, we propose a differentiable auto data augmentation method. We formulate the problem of searching for a data augmentation policy in a differentiable form, so that the optimal policy setting can be easily updated by backpropagation during training. This augmentation method improves training efficiency. A parallel-pyramid structure follows to compensate for the information loss introduced by the network. We introduce two fusion structures, i.e. Parallel Fusion and Progressive Fusion, to process pyramid features from the backbone network. Both fusion structures effectively leverage the rich spatial information at high resolution and the semantic comprehension at low resolution. We propose a refinement stage for the pyramid features to…
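The abstract does not spell out how the augmentation policy search is made differentiable. As a rough illustration of the general idea only, the sketch below uses a softmax relaxation over candidate operations, which is an assumption and not necessarily the authors' formulation. The candidate operations in `OPS`, the scalar "image" `x`, and the squared-error loss standing in for the network's training loss are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical candidate augmentations acting on a scalar "pixel intensity".
OPS = [
    lambda x: x,        # identity
    lambda x: 0.5 * x,  # darken
    lambda x: 1.5 * x,  # brighten
]


def relaxed_augment(x, logits):
    """Softmax-weighted mixture of all candidate ops (continuous relaxation)."""
    probs = softmax(logits)
    return sum(p * op(x) for p, op in zip(probs, OPS))


def policy_gradient(x, target, logits):
    """Analytic gradient of (relaxed_augment(x) - target)^2 w.r.t. the logits."""
    probs = softmax(logits)
    outs = [op(x) for op in OPS]
    y = sum(p * o for p, o in zip(probs, outs))
    dl_dy = 2.0 * (y - target)
    # For a softmax mixture, dy/dlogit_j = p_j * (out_j - y).
    return [dl_dy * p * (o - y) for p, o in zip(probs, outs)]


def train_policy(x, target, steps=200, lr=0.5):
    """Plain gradient descent on the policy logits (toy stand-in for backprop)."""
    logits = [0.0] * len(OPS)
    for _ in range(steps):
        grads = policy_gradient(x, target, logits)
        logits = [l - lr * g for l, g in zip(logits, grads)]
    return logits


if __name__ == "__main__":
    learned = train_policy(x=1.0, target=1.5)
    print([round(p, 3) for p in softmax(learned)])
```

Because the policy parameters are continuous logits rather than a discrete choice, the same gradient step that trains a network could in principle update them as well, which is the property the abstract attributes to its differentiable formulation.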
Taxonomy
Topics: Human Pose and Action Recognition · Advanced Neural Network Applications · Image and Object Detection Techniques
