SasMamba: A Lightweight Structure-Aware Stride State Space Model for 3D Human Pose Estimation
Hu Cui, Wenqiang Hua, Renjing Huang, Shurui Jia, Tessai Hayama

TL;DR
SasMamba introduces a lightweight, structure-aware stride state space model that effectively captures local and global 3D human pose dependencies with linear complexity, improving efficiency and performance.
Contribution
The paper presents SAS-SSM, a novel structure-aware spatiotemporal convolution and stride-based scan strategy, enhancing pose modeling while reducing parameters.
Findings
Achieves competitive 3D pose estimation accuracy.
Uses fewer parameters than existing hybrid models.
Maintains linear computational complexity.
Abstract
Recently, the Mamba architecture based on State Space Models (SSMs) has gained attention in 3D human pose estimation due to its linear complexity and strong global modeling capability. However, existing SSM-based methods typically apply manually designed scan operations to flatten detected 2D pose sequences into purely temporal sequences, either locally or globally. This approach disrupts the inherent spatial structure of human poses and entangles spatial and temporal features, making it difficult to capture complex pose dependencies. To address these limitations, we propose the Skeleton Structure-Aware Stride SSM (SAS-SSM), which first employs a structure-aware spatiotemporal convolution to dynamically capture essential local interactions between joints, and then applies a stride-based scan strategy to construct multi-scale global structural representations. This enables flexible…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsHuman Pose and Action Recognition · Human Motion and Animation · Robot Manipulation and Learning
