Graph and Temporal Convolutional Networks for 3D Multi-person Pose Estimation in Monocular Videos
Yu Cheng, Bo Wang, Bo Yang, Robby T. Tan

TL;DR
This paper introduces a framework that combines graph convolutional networks (GCNs) and temporal convolutional networks (TCNs) to estimate camera-centric 3D poses of multiple people from monocular videos, handling occlusion and missing information without requiring camera parameters.
Contribution
The paper proposes human-joint and human-bone GCNs, integrated with TCNs, for robust 3D multi-person pose estimation in monocular videos that requires no camera parameters, addressing occlusion and temporal consistency.
Findings
Outperforms existing methods in accuracy on benchmark datasets.
Effectively handles occlusion and missing data in multi-person scenarios.
Achieves camera-centric 3D pose estimation without requiring camera parameters.
Abstract
Despite the recent progress, 3D multi-person pose estimation from monocular videos is still challenging due to the commonly encountered problem of missing information caused by occlusion, partially out-of-frame target persons, and inaccurate person detection. To tackle this problem, we propose a novel framework integrating graph convolutional networks (GCNs) and temporal convolutional networks (TCNs) to robustly estimate camera-centric multi-person 3D poses that do not require camera parameters. In particular, we introduce a human-joint GCN, which, unlike the existing GCN, is based on a directed graph that employs the 2D pose estimator's confidence scores to improve the pose estimation results. We also introduce a human-bone GCN, which models the bone connections and provides more information beyond human joints. The two GCNs work together to estimate the spatial frame-wise 3D poses and…
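As a rough illustration of the human-joint GCN idea in the abstract, the sketch below builds a directed joint graph whose edge weights come from 2D detector confidence scores, then applies one graph-convolution step. The abstract does not give the exact formulation, so the weighting and normalization scheme, the helper names confidence_weighted_adjacency and graph_conv, and the toy three-joint skeleton are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def confidence_weighted_adjacency(
    num_joints: int,
    edges: Sequence[Tuple[int, int]],
    conf: Sequence[float],
) -> Matrix:
    """Row-normalized directed adjacency (illustrative assumption).

    adj[dst][src] is the weight of information flowing from joint `src`
    into joint `dst`, proportional to the detector's confidence in `src`.
    Because the weight depends on the source joint only, adj[a][b] and
    adj[b][a] generally differ, which makes the graph directed.
    """
    adj = [[0.0] * num_joints for _ in range(num_joints)]
    for i in range(num_joints):
        adj[i][i] = conf[i]  # self-loop weighted by the joint's own confidence
    for src, dst in edges:
        adj[dst][src] = conf[src]
    for i, row in enumerate(adj):
        total = sum(row)
        if total > 0.0:  # leave all-zero rows untouched
            adj[i] = [w / total for w in row]
    return adj


def matmul(p: Matrix, q: Matrix) -> Matrix:
    """Plain dense matrix product, kept dependency-free for the sketch."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*q)] for row in p]


def graph_conv(adj: Matrix, feats: Matrix, weight: Matrix) -> Matrix:
    """One linear graph-convolution step: adj @ feats @ weight."""
    return matmul(matmul(adj, feats), weight)


if __name__ == "__main__":
    # Hypothetical chain: shoulder(0) - elbow(1) - wrist(2), elbow occluded.
    conf = [0.9, 0.1, 0.8]
    edges = [(0, 1), (1, 0), (1, 2), (2, 1)]  # both directions of each bone
    adj = confidence_weighted_adjacency(3, edges, conf)
    feats = [[0.0, 0.0], [5.0, 5.0], [2.0, 0.0]]  # noisy 2D input at the elbow
    identity = [[1.0, 0.0], [0.0, 1.0]]
    for row in graph_conv(adj, feats, identity):
        print(row)
```

In this toy setup the low-confidence elbow draws most of its updated feature from its confident neighbors, while contributing little to them. That is the intuition behind using confidence scores as directed edge weights.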
Taxonomy
Topics: Human Pose and Action Recognition · Video Surveillance and Tracking Methods · Diabetic Foot Ulcer Assessment and Management
Methods: Graph Convolutional Networks
