Spatio-temporal MLP-graph network for 3D human pose estimation
Tanvir Hassan, A. Ben Hamza

TL;DR
This paper introduces a spatio-temporal MLP-graph network that uses temporal information from 2D pose sequences, together with adaptive graph modulation, to improve 3D human pose estimation, especially under occlusion and pose ambiguity.
Contribution
It proposes a new architecture combining joint-mixing MLP blocks and graph weighted Jacobi networks with learnable modulation, advancing 3D pose estimation accuracy.
Findings
Outperforms recent state-of-the-art methods on benchmark datasets.
Effectively captures temporal correlations and joint relationships.
Demonstrates robustness to occlusions and ambiguous poses.
Abstract
Graph convolutional networks and their variants have shown significant promise in 3D human pose estimation. Despite their success, most of these methods only consider spatial correlations between body joints and do not take into account temporal correlations, thereby limiting their ability to capture relationships in the presence of occlusions and inherent ambiguity. To address this potential weakness, we propose a spatio-temporal network architecture composed of a joint-mixing multi-layer perceptron block that facilitates communication among different joints and a graph weighted Jacobi network block that enables communication among various feature channels. The major novelty of our approach lies in a new weighted Jacobi feature propagation rule obtained through graph filtering with implicit fairing. We leverage temporal information from the 2D pose sequences, and integrate weight…
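To make the idea of "graph filtering with implicit fairing" solved by a weighted Jacobi iteration more concrete, here is a minimal sketch in plain Python. It is not the authors' propagation rule: it assumes the standard implicit fairing system (I + sL)H = X with the symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2}, and it omits the learnable weight matrices, the modulation, and the temporal input that the paper describes. The function names, the 3-joint chain graph, and the parameters s (smoothing strength) and omega (Jacobi relaxation weight) are all illustrative assumptions.

```python
import math


def normalized_adjacency(adj):
    """Return D^{-1/2} A D^{-1/2} for a symmetric adjacency matrix (list of lists)."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    return [[adj[i][j] / math.sqrt(deg[i] * deg[j]) if adj[i][j] else 0.0
             for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def weighted_jacobi_fairing(adj, x, s=1.0, omega=0.8, num_iters=50):
    """Approximately solve (I + s*L) H = X by weighted Jacobi iteration.

    With L = I - A_hat and no self-loops, the system matrix is
    (1 + s) I - s A_hat, whose diagonal is (1 + s). One sweep is
    H <- (1 - omega) H + omega * (X + s * A_hat H) / (1 + s).
    """
    a_hat = normalized_adjacency(adj)
    h = [row[:] for row in x]  # start from the input features
    for _ in range(num_iters):
        ah = matmul(a_hat, h)
        h = [[(1 - omega) * h[i][c] + omega * (x[i][c] + s * ah[i][c]) / (1 + s)
              for c in range(len(x[0]))] for i in range(len(x))]
    return h


def fairing_residual(adj, x, h, s=1.0):
    """Largest absolute entry of (I + s*L) H - X, used to check convergence."""
    a_hat = normalized_adjacency(adj)
    ah = matmul(a_hat, h)
    return max(abs((1 + s) * h[i][c] - s * ah[i][c] - x[i][c])
               for i in range(len(x)) for c in range(len(x[0])))


# Hypothetical 3-joint chain (e.g. hip - knee - ankle) with 2 feature channels.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
x = [[0.0, 1.0], [1.0, 0.0], [2.0, 2.0]]
h = weighted_jacobi_fairing(adj, x, s=1.0, omega=0.8, num_iters=60)
```

In a network layer, a rule of this kind would typically be unrolled for only one or a few sweeps and combined with learned weights; running it to convergence here is only meant to show that the iteration targets the fairing system.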
Taxonomy
Topics: Human Pose and Action Recognition · Video Surveillance and Tracking Methods · Infrared Thermography in Medicine
