AMPose: Alternately Mixed Global-Local Attention Model for 3D Human Pose Estimation
Hongxin Lin, Yunwei Chiu, Peiyuan Wu

TL;DR
AMPose alternates global attention (a Transformer encoder) with local attention (GCN blocks) to capture both non-local and physically connected relations among human joints for 3D pose estimation.
Contribution
The paper presents a novel alternating stacking of Transformer encoder and GCN blocks to better model human joint relations in 3D pose estimation.
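To make the alternating stacking concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes a hypothetical 5-joint chain skeleton, single-head attention, and random weights. It also leaves out layer normalization, feed-forward sublayers, and position encodings. All function names and sizes are illustrative assumptions.

```python
import numpy as np

# Hypothetical 5-joint chain skeleton, for illustration only
# (not the Human3.6M skeleton used in the paper).
EDGES = [(0, 1), (1, 2), (2, 3), (3, 4)]

def normalized_adjacency(num_joints, edges):
    """Symmetrically normalized adjacency with self-loops: D^-1/2 (A + I) D^-1/2."""
    a = np.eye(num_joints)
    for i, j in edges:
        a[i, j] = a[j, i] = 1.0
    d = a.sum(axis=1)
    return a / np.sqrt(np.outer(d, d))

def global_attention(x, wq, wk, wv):
    """Single-head self-attention: every joint attends to all other joints."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return x + weights @ v  # residual connection

def local_graph_conv(x, adj, w):
    """Graph convolution: each joint aggregates only physically connected neighbours."""
    return x + np.maximum(adj @ x @ w, 0.0)  # ReLU plus residual connection

def alternating_forward(x, adj, layers):
    """Alternate a global (attention) step and a local (GCN) step per layer."""
    for wq, wk, wv, wg in layers:
        x = global_attention(x, wq, wk, wv)
        x = local_graph_conv(x, adj, wg)
    return x

rng = np.random.default_rng(0)
num_joints, dim, depth = 5, 8, 2  # assumed sizes for the sketch
adj = normalized_adjacency(num_joints, EDGES)
layers = [
    tuple(rng.normal(scale=0.1, size=(dim, dim)) for _ in range(4))
    for _ in range(depth)
]
x = rng.normal(size=(num_joints, dim))  # one feature vector per joint
out = alternating_forward(x, adj, layers)
print(out.shape)  # (5, 8)
```

The attention step mixes information across all joint pairs, while the adjacency matrix restricts the graph step to skeleton edges, so each layer applies one global and one local update in turn.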
Findings
Outperforms existing methods on the Human3.6M dataset
Demonstrates strong generalization on the MPI-INF-3DHP dataset
Effective integration of global and local joint relations
Abstract
Graph convolutional networks (GCNs) have been applied to model the physically connected and non-local relations among human joints for 3D human pose estimation (HPE). In addition, purely Transformer-based models have recently shown promising results in video-based 3D HPE. However, single-frame methods still need to model the physically connected relations among joints, because feature representations transformed only by global relations via the Transformer neglect information on the human skeleton. To address this problem, we propose AMPose, a novel method in which Transformer encoder and GCN blocks are alternately stacked to combine the global and physically connected relations among joints for HPE. In AMPose, the Transformer encoder is applied to connect each joint with all the other joints, while GCNs are applied to capture information on physically…
Taxonomy
Topics: Human Pose and Action Recognition · Hand Gesture Recognition Systems · Anomaly Detection Techniques and Applications
Methods: Multi-Head Attention · Attention Is All You Need · Residual Connection · Dense Connections · Absolute Position Encodings · Linear Layer · Label Smoothing · Dropout · Adam · Softmax
