MovePose: A High-performance Human Pose Estimation Algorithm on Mobile and Edge Devices

Dongyang Yu; Haoyue Zhang; Ruisheng Zhao; Guoqi Chen; Wangpeng An; Yanhong Yang

arXiv:2308.09084·cs.CV·July 25, 2024

TL;DR

MovePose is a lightweight, high-performance human pose estimation algorithm optimized for mobile and edge devices, achieving real-time speed and improved accuracy compared to existing solutions.

Contribution

MovePose introduces a lightweight CNN architecture that uses techniques such as deconvolution and large-kernel convolution to improve human pose estimation on mobile devices.
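As a rough illustration of the two techniques named above, the sketch below applies the standard output-size formulas for a convolution and a transposed convolution (deconvolution). The sizes used (an 8-pixel feature map, a 7-wide "large" kernel, a 4-wide stride-2 deconvolution) are illustrative assumptions, not values taken from the paper, and the helper names are hypothetical.

```python
def conv_out_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a standard convolution (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def deconv_out_size(size, kernel, stride=1, padding=0, output_padding=0):
    """Spatial output size of a transposed convolution (dilation 1)."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


def same_padding(kernel):
    """Padding that preserves the input size for an odd kernel at stride 1."""
    return (kernel - 1) // 2


# Assumed sizes, chosen only for illustration.
feature = 8   # side length of a low-resolution feature map
large_k = 7   # a "large" odd kernel

# A large-kernel convolution with "same" padding widens the receptive
# field without changing the feature-map resolution.
kept = conv_out_size(feature, large_k, stride=1, padding=same_padding(large_k))

# A 4-wide, stride-2 transposed convolution with padding 1 doubles the resolution.
upsampled = deconv_out_size(feature, kernel=4, stride=2, padding=1)

print(f"large-kernel conv: {feature} -> {kept}")
print(f"deconvolution:     {feature} -> {upsampled}")
```

Under these assumptions the large-kernel layer keeps the map at 8 pixels per side while the deconvolution upsamples it to 16, which is the usual role of each layer type in a lightweight pose network.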

Findings

01. Achieved 68.0 mAP on the COCO validation dataset.

02. Real-time performance: 69+ fps on CPU and 452+ fps on GPU.

03. Over 11 fps on an Android smartphone with a Snapdragon processor.
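To put the reported frame rates in terms of latency, the short sketch below converts each fps figure into a per-frame time budget. The fps values are the ones reported for the paper; since they are stated as lower bounds ("69+", "452+", "above 11"), the resulting times are upper bounds. The function name is hypothetical.

```python
def frame_budget_ms(fps):
    """Per-frame time in milliseconds implied by a frame rate."""
    return 1000.0 / fps


# Reported lower-bound frame rates per device.
reported_fps = {
    "CPU": 69,
    "GPU": 452,
    "Android smartphone": 11,
}

for device, fps in reported_fps.items():
    print(f"{device}: at most {frame_budget_ms(fps):.1f} ms per frame")
```

By this arithmetic, the CPU and GPU figures correspond to at most about 14.5 ms and 2.2 ms per frame, while the smartphone figure corresponds to at most about 90.9 ms per frame.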

Abstract

We present MovePose, an optimized lightweight convolutional neural network designed specifically for real-time body pose estimation on CPU-based mobile devices. Current solutions do not provide satisfactory accuracy and speed for human posture estimation, and MovePose addresses this gap. It aims to maintain real-time performance while improving the accuracy of human posture estimation on mobile devices. MovePose attains a Mean Average Precision (mAP) score of 68.0 on the COCO validation dataset. It runs at 69+ frames per second (fps) on an Intel i9-10920X CPU and at 452+ fps on an NVIDIA RTX3090 GPU. On an Android phone equipped with a Snapdragon 8 + 4G processor, it exceeds 11 fps. To enhance accuracy, we incorporated three…

Taxonomy

Topics: Human Pose and Action Recognition · Hand Gesture Recognition Systems · Context-Aware Activity Recognition Systems

Methods: Convolution · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings