Weakly-supervised 3D Human Pose Estimation with Cross-view U-shaped Graph Convolutional Network
Guoliang Hua, Hong Liu, Wenhao Li, Qian Zhang, Runwei Ding, Xin Xu

TL;DR
This paper introduces a weakly-supervised cross-view 3D human pose estimation method that leverages two camera views and a novel graph convolutional network to achieve state-of-the-art accuracy without requiring 3D ground truth annotations.
Contribution
It proposes a new pipeline combining triangulation and a cross-view U-shaped graph convolutional network for weakly-supervised 3D pose estimation from two views.
Findings
Achieves 27.4 mm MPJPE on Human3.6M, outperforming previous methods.
Uses only 2D annotations and no 3D ground truth for training.
Demonstrates effectiveness of cross-view GCN in pose refinement.
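For readers unfamiliar with the metric quoted above, MPJPE (mean per-joint position error) is the average Euclidean distance between predicted and reference joint positions, usually reported in millimetres. The following is a minimal sketch only: the function name `mpjpe` and the array layout are assumptions made for illustration, and no alignment step is applied because this summary does not state the evaluation protocol the paper used.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean per-joint position error between two pose arrays.

    pred, gt: arrays of shape (..., J, 3) in the same units (e.g. mm).
    Hypothetical helper for illustration; no alignment is applied.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    if pred.shape != gt.shape:
        raise ValueError("pred and gt must have the same shape")
    # Euclidean distance per joint, then the mean over all joints and poses.
    return float(np.linalg.norm(pred - gt, axis=-1).mean())
```

Under this definition, a figure of 27.4 mm means that predicted joints lie on average 27.4 mm from their reference positions.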
Abstract
Although monocular 3D human pose estimation methods have made significant progress, the problem is far from solved because of the inherent depth ambiguity. Exploiting multi-view information is instead a practical way to achieve absolute 3D human pose estimation. In this paper, we propose a simple yet effective pipeline for weakly-supervised cross-view 3D human pose estimation. Using only two camera views, our method achieves state-of-the-art performance in a weakly-supervised manner, requiring no 3D ground truth but only 2D annotations. Specifically, our method contains two steps: triangulation and refinement. First, given 2D keypoints that can be obtained through any classic 2D detection method, triangulation is performed across the two views to lift the 2D keypoints into coarse 3D poses. Then, a novel cross-view U-shaped graph convolutional network (CV-UGCN), which can explore the…
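The first step described in the abstract, lifting 2D keypoints from two views into a coarse 3D pose, can be illustrated with standard linear (DLT) triangulation. This is a sketch under stated assumptions, not the authors' implementation: the excerpt does not say which triangulation algorithm the paper uses, and the names `triangulate_two_views` and `project`, as well as the 3x4 projection-matrix inputs, are hypothetical choices made for this example.

```python
import numpy as np


def project(P, X):
    """Project 3D points X of shape (J, 3) with a 3x4 camera matrix P.

    Hypothetical helper, used here only to generate example 2D keypoints.
    """
    Xh = np.hstack([X, np.ones((X.shape[0], 1))])  # homogeneous coordinates
    x = Xh @ P.T
    return x[:, :2] / x[:, 2:3]


def triangulate_two_views(P1, P2, x1, x2):
    """Linear (DLT) triangulation of J keypoints seen in two calibrated views.

    P1, P2: 3x4 projection matrices (assumed known from calibration).
    x1, x2: (J, 2) pixel coordinates of the same joints in each view.
    Returns a (J, 3) array of coarse 3D joint positions.
    """
    P1 = np.asarray(P1, dtype=float)
    P2 = np.asarray(P2, dtype=float)
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    points = np.empty((x1.shape[0], 3))
    for j, ((u1, v1), (u2, v2)) in enumerate(zip(x1, x2)):
        # Each view contributes two linear constraints on the homogeneous point.
        A = np.stack([
            u1 * P1[2] - P1[0],
            v1 * P1[2] - P1[1],
            u2 * P2[2] - P2[0],
            v2 * P2[2] - P2[1],
        ])
        # The least-squares solution is the right singular vector associated
        # with the smallest singular value of A.
        _, _, vt = np.linalg.svd(A)
        Xh = vt[-1]
        points[j] = Xh[:3] / Xh[3]
    return points
```

With noise-free keypoints this recovers the original 3D points up to numerical precision. With detected keypoints the result is only a coarse estimate, which is the role the abstract assigns to the subsequent refinement step.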
Taxonomy
Topics: Human Pose and Action Recognition · Video Surveillance and Tracking Methods · Hand Gesture Recognition Systems
