AlphaPose: Whole-Body Regional Multi-Person Pose Estimation and Tracking in Real-Time
Hao-Shu Fang, Jiefeng Li, Hongyang Tang, Chao Xu, Haoyi Zhu, Yuliang Xiu, Yong-Lu Li, Cewu Lu

TL;DR
AlphaPose is a real-time system for whole-body multi-person pose estimation and tracking. It combines new techniques for localizing keypoints and for tracking people even when detections are imperfect, and it is reported to outperform existing methods.
Contribution
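The paper introduces AlphaPose, a system for accurate, real-time whole-body pose estimation and tracking. Its new techniques are Symmetric Integral Keypoint Regression (SIKR) for keypoint localization, Parametric Pose Non-Maximum-Suppression (P-NMS) for removing redundant human detections, and Pose Aware Identity Embedding for joint pose estimation and tracking.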
Findings
Significant speed and accuracy improvements over state-of-the-art methods.
Effective localization of whole-body keypoints including face, hands, and feet.
Robust tracking of humans despite inaccurate bounding boxes and redundant detections.
Abstract
Accurate whole-body multi-person pose estimation and tracking is an important yet challenging topic in computer vision. To capture the subtle actions of humans for complex behavior analysis, whole-body pose estimation including the face, body, hand and foot is essential over conventional body-only pose estimation. In this paper, we present AlphaPose, a system that can perform accurate whole-body pose estimation and tracking jointly while running in realtime. To this end, we propose several new techniques: Symmetric Integral Keypoint Regression (SIKR) for fast and fine localization, Parametric Pose Non-Maximum-Suppression (P-NMS) for eliminating redundant human detections and Pose Aware Identity Embedding for jointly pose estimation and tracking. During training, we resort to Part-Guided Proposal Generator (PGPG) and multi-domain knowledge distillation to further improve the accuracy.…
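As background for the localization step named in the abstract, the sketch below shows plain integral keypoint regression (often called soft-argmax): the keypoint coordinate is taken as the expected position under a softmax-normalized heatmap, which gives sub-pixel output. This is only the generic technique, not the symmetric variant (SIKR) that the paper proposes, and the function name `soft_argmax_2d` and the `beta` sharpness parameter are assumptions made for illustration.

```python
import numpy as np

def soft_argmax_2d(heatmap, beta=1.0):
    """Return the expected (x, y) location under a softmax-normalized heatmap.

    Hypothetical helper for illustration; not taken from the AlphaPose code.
    """
    h, w = heatmap.shape
    logits = beta * heatmap.reshape(-1).astype(np.float64)
    logits -= logits.max()          # subtract the max for numerical stability
    probs = np.exp(logits)
    probs /= probs.sum()            # softmax over all pixels
    probs = probs.reshape(h, w)

    xs = np.arange(w)               # column indices
    ys = np.arange(h)               # row indices
    x = float((probs.sum(axis=0) * xs).sum())  # expectation over columns
    y = float((probs.sum(axis=1) * ys).sum())  # expectation over rows
    return x, y

# A sharp peak at row 2, column 5 pulls the estimate to that pixel.
peaked = np.zeros((8, 8))
peaked[2, 5] = 50.0
peak_x, peak_y = soft_argmax_2d(peaked)

# A flat heatmap yields the centre of the grid.
flat_x, flat_y = soft_argmax_2d(np.zeros((4, 4)))
```

Because the output is an expectation rather than the index of the maximum, it is differentiable and not limited to the heatmap's pixel grid, which is the usual motivation for regression-style localization.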
Taxonomy
Topics: Human Pose and Action Recognition · Video Surveillance and Tracking Methods · Gait Recognition and Analysis
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Knowledge Distillation · Attentive Walk-Aggregating Graph Neural Network
