Pose Magic: Efficient and Temporally Consistent Human Pose Estimation with a Hybrid Mamba-GCN Network

Xinyi Zhang; Qiqi Bao; Qinpeng Cui; Wenming Yang; Qingmin Liao

arXiv:2408.02922·cs.CV·February 27, 2025

TL;DR

Pose Magic introduces a hybrid Mamba-GCN network that achieves state-of-the-art accuracy in 3D human pose estimation while significantly reducing computational costs and maintaining temporal consistency.

Contribution

This work presents a novel hybrid spatiotemporal architecture combining Mamba and GCN for efficient, accurate, and temporally consistent 3D human pose estimation.

Findings

1. Achieves new SOTA with 0.9 mm error reduction.
2. Reduces FLOPs by 74.1%.
3. Maintains motion consistency and generalizes to unseen sequences.

Abstract

Current state-of-the-art (SOTA) methods in 3D Human Pose Estimation (HPE) are primarily based on Transformers. However, existing Transformer-based 3D HPE backbones often encounter a trade-off between accuracy and computational efficiency. To resolve the above dilemma, in this work, we leverage recent advances in state space models and utilize Mamba for high-quality and efficient long-range modeling. Nonetheless, Mamba still faces challenges in precisely exploiting local dependencies between joints. To address these issues, we propose a new attention-free hybrid spatiotemporal architecture named Hybrid Mamba-GCN (Pose Magic). This architecture introduces local enhancement with GCN by capturing relationships between neighboring joints, thus producing new representations to complement Mamba's outputs. By adaptively fusing representations from Mamba and GCN, Pose Magic demonstrates superior…
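
The idea in the abstract (a long-range temporal branch, a GCN branch over neighboring joints, and an adaptive fusion of the two) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the 5-joint chain skeleton, the function names, the fixed linear recurrence used as a stand-in for Mamba's selective scan, and the sigmoid gate with hand-picked weights in place of learned fusion parameters.

```python
import math

# Hypothetical 5-joint chain skeleton (0-1-2-3-4); not the paper's joint layout.
EDGES = [(0, 1), (1, 2), (2, 3), (3, 4)]
NUM_JOINTS = 5


def normalized_adjacency(num_joints, edges):
    """Return D^{-1/2} (A + I) D^{-1/2} as nested lists."""
    a = [[1.0 if i == j else 0.0 for j in range(num_joints)]
         for i in range(num_joints)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(num_joints)]
            for i in range(num_joints)]


def gcn_aggregate(adj, frame):
    """Local branch: mix each joint's features with its neighbours' (no learned weights)."""
    n, c = len(frame), len(frame[0])
    return [[sum(adj[i][k] * frame[k][ch] for k in range(n)) for ch in range(c)]
            for i in range(n)]


def linear_scan(seq, decay=0.9):
    """Temporal branch stand-in: h_t = decay * h_{t-1} + (1 - decay) * x_t.

    This is a fixed linear recurrence, NOT Mamba's input-dependent selective scan.
    """
    state = [row[:] for row in seq[0]]  # initialise with the first frame
    out = []
    for frame in seq:
        state = [[decay * s + (1.0 - decay) * x for s, x in zip(s_row, x_row)]
                 for s_row, x_row in zip(state, frame)]
        out.append(state)
    return out


def adaptive_fuse(global_feat, local_feat, w_global=1.0, w_local=-1.0, bias=0.0):
    """Per-joint gate alpha in (0, 1); output = alpha * global + (1 - alpha) * local.

    The gate weights are hypothetical constants; a real model would learn them.
    """
    fused = []
    for g_row, l_row in zip(global_feat, local_feat):
        score = (w_global * sum(g_row) / len(g_row)
                 + w_local * sum(l_row) / len(l_row) + bias)
        alpha = 1.0 / (1.0 + math.exp(-score))
        fused.append([alpha * g + (1.0 - alpha) * l for g, l in zip(g_row, l_row)])
    return fused


def hybrid_block(seq):
    """seq has shape [frames][joints][channels]; the output has the same shape."""
    adj = normalized_adjacency(NUM_JOINTS, EDGES)
    global_seq = linear_scan(seq)
    local_seq = [gcn_aggregate(adj, frame) for frame in seq]
    return [adaptive_fuse(g, l) for g, l in zip(global_seq, local_seq)]


# Toy input: 4 frames, 5 joints, 2 channels.
toy_seq = [[[float(t + j), float(t - j)] for j in range(NUM_JOINTS)]
           for t in range(4)]
fused_seq = hybrid_block(toy_seq)
```

Because the gate is a convex combination, each fused value lies between the corresponding outputs of the two branches, which makes the sketch easy to sanity-check on toy inputs.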

Taxonomy

Topics: Human Pose and Action Recognition · Hand Gesture Recognition Systems · Gait Recognition and Analysis

Methods: Graph Convolutional Network · Mamba: Linear-Time Sequence Modeling with Selective State Spaces