TL;DR
This paper introduces a probabilistic, view-invariant embedding for 2D human pose keypoints that is robust to occlusion and improves cross-view pose retrieval and action recognition without explicit 3D pose estimation.
Contribution
The paper proposes a probabilistic embedding space for 2D human poses that is view-invariant and robust to occlusion. In retrieval tasks, it is reported to outperform existing 3D pose estimation methods.
Findings
Higher accuracy in cross-view pose retrieval than 3D pose estimation methods.
Effective partial pose recognition with occlusion augmentation.
Competitive action recognition performance without additional training.
Abstract
Recognition of human poses and actions is crucial for autonomous systems to interact smoothly with people. However, cameras generally capture human poses in 2D as images and videos, which can have significant appearance variations across viewpoints that make the recognition tasks challenging. To address this, we explore recognizing similarity in 3D human body poses from 2D information, which has not been well-studied in existing works. Here, we propose an approach to learning a compact view-invariant embedding space from 2D body joint keypoints, without explicitly predicting 3D poses. Input ambiguities of 2D poses from projection and occlusion are difficult to represent through a deterministic mapping, and therefore we adopt a probabilistic formulation for our embedding space. Experimental results show that our embedding model achieves higher accuracy when retrieving similar poses…
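The probabilistic formulation described in the abstract can be illustrated with a minimal sketch. This is not the paper's model. It assumes a diagonal Gaussian embedding produced by random, untrained linear maps, a 13-joint skeleton, a 16-dimensional embedding, and a sampled similarity score of the form exp(-distance). All names here (`normalize_keypoints`, `GaussianPoseEmbedder`, `match_score`) are hypothetical and chosen only for illustration.

```python
import numpy as np

NUM_JOINTS = 13  # assumed skeleton size, not taken from the paper
EMBED_DIM = 16   # assumed embedding size, not taken from the paper


def normalize_keypoints(keypoints):
    """Center 2D keypoints of shape (J, 2) and rescale them to unit size."""
    centered = keypoints - keypoints.mean(axis=0)
    scale = np.linalg.norm(centered, axis=1).max()
    return centered / max(scale, 1e-8)


class GaussianPoseEmbedder:
    """Maps a 2D pose to the mean and variance of a diagonal Gaussian.

    The weights are random placeholders; a real model would learn them.
    """

    def __init__(self, rng):
        in_dim = NUM_JOINTS * 2
        self.w_mean = rng.normal(scale=0.1, size=(in_dim, EMBED_DIM))
        self.w_logvar = rng.normal(scale=0.1, size=(in_dim, EMBED_DIM))

    def __call__(self, keypoints):
        x = normalize_keypoints(keypoints).reshape(-1)
        mean = x @ self.w_mean
        var = np.exp(x @ self.w_logvar)  # exponentiate so variance stays positive
        return mean, var


def match_score(emb_a, emb_b, rng, num_samples=20):
    """Monte Carlo similarity between two Gaussian embeddings, in (0, 1]."""
    mean_a, var_a = emb_a
    mean_b, var_b = emb_b
    z_a = mean_a + np.sqrt(var_a) * rng.standard_normal((num_samples, EMBED_DIM))
    z_b = mean_b + np.sqrt(var_b) * rng.standard_normal((num_samples, EMBED_DIM))
    # Pairwise distances between every sample of A and every sample of B.
    dists = np.linalg.norm(z_a[:, None, :] - z_b[None, :, :], axis=-1)
    return float(np.exp(-dists).mean())


rng = np.random.default_rng(0)
embedder = GaussianPoseEmbedder(rng)

pose_a = rng.uniform(size=(NUM_JOINTS, 2))  # synthetic 2D keypoints
pose_b = rng.uniform(size=(NUM_JOINTS, 2))

emb_a = embedder(pose_a)
emb_b = embedder(pose_b)
mean_a, var_a = emb_a
score = match_score(emb_a, emb_b, rng)
print("similarity score:", score)
```

Representing each pose as a distribution rather than a point lets the variance express how ambiguous a 2D input is. In a trained model, an input whose 2D projection is consistent with many 3D poses could receive a larger variance, which lowers its sampled similarity to any single candidate.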
