Simple and Lightweight Human Pose Estimation
Zhe Zhang, Jie Tang, Gangshan Wu

TL;DR
This paper introduces a lightweight human pose estimation network, LPN, whose small variant LPN-50 uses only 9% of the parameters and 11% of the FLOPs of SimpleBaseline(ResNet50) while remaining competitive in accuracy, making it suitable for deployment on resource-constrained devices.
Contribution
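The paper redesigns the bottleneck block as a lightweight block built from two existing concepts, depthwise convolution and an attention mechanism, and uses it to build the Lightweight Pose Network (LPN) following the design principles of SimpleBaseline, cutting model size and FLOPs while maintaining competitive accuracy.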
Findings
LPN-50 achieves 68.7 AP on COCO test-dev.
LPN-50 has only 9% of the parameters and 11% of the FLOPs of SimpleBaseline(ResNet50).
Inference speed reaches 17 FPS on CPU.
Abstract
Recent research on human pose estimation has achieved significant improvement. However, most existing methods tend to pursue higher scores on benchmark datasets using complex architectures or computationally expensive models, ignoring the deployment costs in practice. In this paper, we investigate the problem of simple and lightweight human pose estimation. We first redesign a lightweight bottleneck block with two non-novel concepts: depthwise convolution and attention mechanism. Then, based on this lightweight block, we present a Lightweight Pose Network (LPN) following the architecture design principles of SimpleBaseline. The model size (#Params) of our small network LPN-50 is only 9% of SimpleBaseline(ResNet50), and the computational complexity (FLOPs) is only 11%. To give full play to the potential of our LPN and get more accurate predicted results, we also propose an iterative…
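To see why depthwise convolution is a natural tool for shrinking a bottleneck block, the sketch below counts the weights of a standard 3x3 convolution and of its depthwise counterpart. It is illustrative arithmetic only, not the paper's LPN block: the helper name conv_params and the channel width of 64 are assumptions chosen for the example, and the attention mechanism is not modeled.

```python
def conv_params(c_in, c_out, k, groups=1, bias=False):
    """Count learnable parameters of a 2D convolution with a square k x k kernel.

    Each output channel sees only c_in / groups input channels, so grouping
    divides the weight count by `groups`.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("c_in and c_out must both be divisible by groups")
    weights = (c_in // groups) * c_out * k * k
    return weights + (c_out if bias else 0)


channels = 64  # hypothetical layer width, not taken from the paper

# Standard 3x3 convolution: every output channel mixes all input channels.
standard = conv_params(channels, channels, 3)

# Depthwise 3x3 convolution: groups == channels, one filter per channel.
depthwise = conv_params(channels, channels, 3, groups=channels)

ratio = depthwise / standard

print(f"standard 3x3:  {standard} weights")   # 36864
print(f"depthwise 3x3: {depthwise} weights")  # 576
print(f"depthwise / standard = {ratio:.4f}")
```

For equal input and output widths the ratio is 1 / channels, so the saving from this one layer grows with width. A whole network shrinks by less than that, because pointwise convolutions and other layers keep their full cost.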
Taxonomy
Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Hand Gesture Recognition Systems
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Depthwise Convolution · Convolution
