DIR-BHRNet: A Lightweight Network for Real-time Vision-based Multi-person Pose Estimation on Smartphones
Gongjin Lan, Yu Wu, Qi Hao

TL;DR
This paper introduces DIR-BHRNet, a lightweight neural network designed for real-time multi-person pose estimation on smartphones, achieving high accuracy and over 10 FPS on Android devices.
Contribution
The paper proposes a novel lightweight convolutional module and an efficient network structure to enable real-time multi-person pose estimation on low-performance mobile devices.
Findings
Outperforms state-of-the-art methods in accuracy on COCO and CrowdPose datasets.
Achieves over 10 FPS on mainstream Android smartphones.
Provides publicly available source code and executable for easy deployment.
Abstract
Human pose estimation (HPE), particularly multi-person pose estimation (MPPE), has been applied in many domains such as human-machine systems. However, current MPPE methods generally run on powerful GPU systems and incur high computational costs. Real-time MPPE on mobile devices with low-performance computing is a challenging task. In this paper, we propose a lightweight neural network, DIR-BHRNet, for real-time MPPE on smartphones. In DIR-BHRNet, we design a novel lightweight convolutional module, Dense Inverted Residual (DIR), which improves accuracy by adding a depthwise convolution and a shortcut connection to the well-known Inverted Residual, and a novel efficient neural network structure, Balanced HRNet (BHRNet), which reduces computational costs by reconfiguring the number of convolutional blocks on each branch. We evaluate DIR-BHRNet on the well-known COCO and CrowdPose…
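To give a rough sense of why adding a depthwise convolution to an Inverted Residual block can stay lightweight, the sketch below counts convolution weights for a MobileNetV2-style inverted residual (1x1 expansion, 3x3 depthwise, 1x1 projection) and for one additional 3x3 depthwise convolution. It is not the authors' DIR implementation: the channel width (32), the expansion factor (6), the placement of the extra depthwise convolution on the expanded tensor, and all function names are assumptions made for illustration. Biases and batch-norm parameters are ignored, and an identity shortcut is assumed to add no weights.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count of a k x k convolution, ignoring bias and batch norm."""
    return (c_in // groups) * c_out * k * k


def inverted_residual_params(c_in, c_out, expand=6):
    """Weights of a MobileNetV2-style inverted residual block.

    The expansion factor of 6 is an assumed, illustrative value.
    """
    hidden = c_in * expand
    return (
        conv_params(c_in, hidden, 1)                       # 1x1 pointwise expansion
        + conv_params(hidden, hidden, 3, groups=hidden)    # 3x3 depthwise convolution
        + conv_params(hidden, c_out, 1)                    # 1x1 pointwise projection
    )


def extra_depthwise_params(channels, k=3):
    """Weights of one additional k x k depthwise convolution (hypothetical placement)."""
    return conv_params(channels, channels, k, groups=channels)


# Assumed sizes: 32 input/output channels, expansion factor 6.
base = inverted_residual_params(32, 32)
extra = extra_depthwise_params(32 * 6)

print(f"inverted residual weights: {base}")
print(f"extra depthwise weights:   {extra}")
print(f"relative increase:         {extra / base:.1%}")
```

Under these assumptions the extra depthwise convolution adds on the order of a tenth of the block's weights, because depthwise filters scale with the channel count rather than with its square. The actual overhead in DIR depends on where the paper places the added convolution and shortcut.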
Taxonomy
Topics: Video Surveillance and Tracking Methods · Human Pose and Action Recognition · Hand Gesture Recognition Systems
