Weakly-Supervised Discovery of Geometry-Aware Representation for 3D Human Pose Estimation
Xipeng Chen, Kwan-Yee Lin, Wentao Liu, Chen Qian, Xiaogang Wang, Liang Lin

TL;DR
This paper introduces a weakly-supervised, geometry-aware 3D human pose representation learned from multi-view data and 2D keypoints, improving 3D pose estimation robustness and generalizability across environments.
Contribution
It proposes a novel skeleton-based auto-encoder with view synthesis and representation consistency constraints for learning 3D geometry-aware pose representations using minimal supervision.
Findings
Reports significantly better 3D pose estimation accuracy than prior state-of-the-art methods.
Effective in cross-environment generalization.
Utilizes only 2D keypoints and multi-view data for training.
Abstract
Recent studies have shown remarkable advances in 3D human pose estimation from monocular images, with the help of large-scale indoor 3D datasets and sophisticated network architectures. However, generalizability to different environments remains an elusive goal. In this work, we propose a geometry-aware 3D representation for the human pose to address this limitation, using multiple views in a simple auto-encoder model at the training stage and only 2D keypoint information as supervision. A view synthesis framework is proposed to learn the 3D representation shared between viewpoints by synthesizing the human pose from one viewpoint to another. Instead of performing a direct transfer at the raw image level, we propose a skeleton-based encoder-decoder mechanism to distil only pose-related representation in the latent space. A learning-based representation consistency…
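The view synthesis idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes, for illustration only, that the latent code is a set of 3D points that can be rotated by the relative camera rotation between two views, that the encoder and decoder are untrained linear maps acting on 2D keypoint coordinates, and that the consistency term is a plain squared error rather than a learned one. All names (`encode`, `decode`, `synthesize_view`, `NUM_JOINTS`, `LATENT_POINTS`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_JOINTS = 17      # assumed number of 2D keypoints per skeleton
LATENT_POINTS = 32   # assumed number of 3D points in the latent code

# Hypothetical, untrained linear encoder/decoder weights (illustrative only).
W_enc = rng.normal(scale=0.1, size=(NUM_JOINTS * 2, LATENT_POINTS * 3))
W_dec = rng.normal(scale=0.1, size=(LATENT_POINTS * 3, NUM_JOINTS * 2))


def encode(kp2d):
    """Map (J, 2) keypoints to an (M, 3) latent point set."""
    return (kp2d.reshape(-1) @ W_enc).reshape(LATENT_POINTS, 3)


def decode(latent):
    """Map an (M, 3) latent point set back to (J, 2) keypoints."""
    return (latent.reshape(-1) @ W_dec).reshape(NUM_JOINTS, 2)


def rotation_y(theta):
    """Rotation matrix about the vertical axis, standing in for the
    relative rotation between two calibrated cameras."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])


def synthesize_view(kp2d_src, R_src_to_tgt):
    """Encode the source-view skeleton, rotate the latent points into the
    target view, and decode a predicted target-view skeleton."""
    latent_tgt = encode(kp2d_src) @ R_src_to_tgt.T
    return decode(latent_tgt), latent_tgt


def view_synthesis_loss(kp2d_src, kp2d_tgt, R_src_to_tgt):
    """Squared error between synthesized and observed target keypoints.
    Only 2D keypoints act as supervision here."""
    pred, _ = synthesize_view(kp2d_src, R_src_to_tgt)
    return float(np.mean((pred - kp2d_tgt) ** 2))


def consistency_loss(kp2d_src, kp2d_tgt, R_src_to_tgt):
    """Simplified consistency term (an assumption of this sketch): the rotated
    source latent should match the latent encoded from the target view."""
    _, latent_src_in_tgt = synthesize_view(kp2d_src, R_src_to_tgt)
    return float(np.mean((latent_src_in_tgt - encode(kp2d_tgt)) ** 2))
```

In a training loop, the two losses would be summed over pairs of synchronized views and minimized with respect to the encoder and decoder parameters. At test time only the encoder would be needed, applied to a single view.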
Taxonomy
Topics: Human Pose and Action Recognition · Advanced Vision and Imaging · Video Surveillance and Tracking Methods
