MovePose: A High-performance Human Pose Estimation Algorithm on Mobile and Edge Devices
Dongyang Yu, Haoyue Zhang, Ruisheng Zhao, Guoqi Chen, Wangpeng An, and Yanhong Yang

TL;DR
MovePose is a lightweight, high-performance human pose estimation algorithm optimized for mobile and edge devices, achieving real-time speed and improved accuracy compared to existing solutions.
Contribution
The paper introduces a novel lightweight CNN architecture that uses techniques such as deconvolution and large-kernel convolution to improve human pose estimation on mobile devices.
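As a rough illustration of how the two named techniques affect feature-map resolution, the sketch below applies the standard output-size formulas for a strided deconvolution (transposed convolution) and a padded large-kernel convolution. The kernel sizes, strides, and the 8x8 input are illustrative assumptions, not MovePose's actual configuration, and the helper names are hypothetical.

```python
def conv_out_size(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial output size of an ordinary convolution (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def deconv_out_size(size: int, kernel: int, stride: int = 1,
                    padding: int = 0, output_padding: int = 0) -> int:
    """Spatial output size of a transposed convolution (dilation 1)."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


def same_padding(kernel: int) -> int:
    """Padding that preserves spatial size for an odd kernel at stride 1."""
    if kernel % 2 == 0:
        raise ValueError("same_padding expects an odd kernel size")
    return (kernel - 1) // 2


# Assumed values for illustration only (not taken from the paper).
feature_map = 8

# A 4x4 deconvolution with stride 2 and padding 1 doubles the resolution.
upsampled = deconv_out_size(feature_map, kernel=4, stride=2, padding=1)

# A 7x7 "large kernel" convolution with matching padding keeps the resolution
# while widening the receptive field.
refined = conv_out_size(upsampled, kernel=7, padding=same_padding(7))

print(f"input {feature_map} -> deconv {upsampled} -> large-kernel conv {refined}")
```

Under these assumptions, the deconvolution is the step that recovers resolution, while the large kernel trades extra multiply-adds for a wider spatial context at the same resolution.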
Findings
Achieves 68.0 mAP on the COCO validation dataset.
Runs in real time, at 69+ fps on CPU and 452+ fps on GPU.
Exceeds 11 fps on an Android smartphone with a Snapdragon processor.
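To put the reported throughputs in per-frame terms, the short sketch below converts each fps figure into an average time per frame. It assumes frames are processed one at a time (no batching or pipelining), in which case time per frame is simply the reciprocal of throughput. Because the figures are lower bounds ("69+", "452+", "above 11"), the results are upper bounds on time per frame. The function and dictionary names are hypothetical.

```python
def frame_time_ms(fps: float) -> float:
    """Average time per frame in milliseconds for a given throughput."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# Lower-bound throughputs as reported in the findings.
reported_fps = {
    "CPU": 69,
    "GPU": 452,
    "Android smartphone": 11,
}

frame_times = {device: frame_time_ms(fps) for device, fps in reported_fps.items()}

for device, ms in frame_times.items():
    print(f"{device}: at most about {ms:.1f} ms per frame")
```

Read this way, the smartphone figure corresponds to roughly 91 ms per frame, against roughly 14.5 ms on the CPU and 2.2 ms on the GPU.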
Abstract
We present MovePose, an optimized lightweight convolutional neural network designed specifically for real-time body pose estimation on CPU-based mobile devices. Current solutions do not provide satisfactory accuracy and speed for human posture estimation, and MovePose addresses this gap. It aims to maintain real-time performance while improving the accuracy of human posture estimation on mobile devices. MovePose attains a Mean Average Precision (mAP) of 68.0 on the COCO \cite{cocodata} validation dataset. It runs at 69+ frames per second (fps) on an Intel i9-10920x CPU and at 452+ fps on an NVIDIA RTX3090 GPU. On an Android phone equipped with a Snapdragon 8 + 4G processor, it exceeds 11 fps. To enhance accuracy, we incorporated three…
Taxonomy
Topics: Human Pose and Action Recognition · Hand Gesture Recognition Systems · Context-Aware Activity Recognition Systems
Methods: Convolution · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
