TL;DR
This paper introduces a probabilistic approach using normalizing flows for monocular 3D human pose estimation, generating diverse pose hypotheses and effectively handling uncertainties and occlusions.
Contribution
It presents a normalizing flow-based method that models the full posterior distribution of feasible 3D poses given a monocular image, so that a diverse set of hypotheses can be generated instead of a single estimate.
Findings
Outperforms all comparable methods in most metrics on the Human3.6M and MPI-INF-3DHP benchmarks
Generates diverse pose hypotheses capturing ambiguities
Models uncertain detections and occlusions by conditioning on uncertainty information from the 2D detector
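The ambiguity these findings refer to can be illustrated with a toy example. The sketch below is not the paper's code: it assumes a simplified orthographic camera (real cameras are perspective) and a hypothetical 17-joint skeleton with random coordinates. It only shows that two different 3D poses can share the same 2D projection, which is why a single 2D observation admits several 3D hypotheses.

```python
import numpy as np

def project_orthographic(pose_3d):
    """Map a (J, 3) pose to (J, 2) by dropping depth (assumed toy camera model)."""
    return pose_3d[:, :2]

rng = np.random.default_rng(0)
pose_a = rng.normal(size=(17, 3))  # hypothetical 17-joint pose, random values
pose_b = pose_a.copy()
pose_b[:, 2] *= -1.0               # flip the depth of every joint

# The forward 3D-to-2D mapping is deterministic, yet both poses give one 2D pose.
same_2d = np.allclose(project_orthographic(pose_a), project_orthographic(pose_b))
differ_3d = not np.allclose(pose_a, pose_b)
print(same_2d, differ_3d)
```

With a perspective camera the set of consistent 3D poses looks different, but the inverse problem remains one-to-many in the same sense.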
Abstract
3D human pose estimation from monocular images is a highly ill-posed problem due to depth ambiguities and occlusions. Nonetheless, most existing works ignore these ambiguities and only estimate a single solution. In contrast, we generate a diverse set of hypotheses that represents the full posterior distribution of feasible 3D poses. To this end, we propose a normalizing flow based method that exploits the deterministic 3D-to-2D mapping to solve the ambiguous inverse 2D-to-3D problem. Additionally, uncertain detections and occlusions are effectively modeled by incorporating uncertainty information of the 2D detector as condition. Further keys to success are a learned 3D pose prior and a generalization of the best-of-M loss. We evaluate our approach on the two benchmark datasets Human3.6M and MPI-INF-3DHP, outperforming all comparable methods in most metrics. The implementation is…
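The abstract mentions a generalization of the best-of-M loss without giving its form. As a point of reference, here is a minimal sketch of the plain best-of-M idea only, under stated assumptions: the function name `best_of_m_loss`, the mean per-joint Euclidean error, and the array shapes are illustrative choices, not the paper's definition or its generalized variant.

```python
import numpy as np

def best_of_m_loss(hypotheses, target):
    """Plain best-of-M: score only the hypothesis closest to the target.

    hypotheses: (M, J, 3) array of M candidate 3D poses (assumed layout).
    target:     (J, 3) ground-truth 3D pose.
    Returns (loss, index of the best hypothesis).
    """
    per_joint = np.linalg.norm(hypotheses - target[None], axis=-1)  # (M, J)
    per_hypothesis = per_joint.mean(axis=-1)                        # (M,)
    best = int(np.argmin(per_hypothesis))
    return float(per_hypothesis[best]), best

# Toy data: a 2-joint target at the origin and three shifted hypotheses.
target = np.zeros((2, 3))
offsets = np.array([1.0, 0.5, 2.0])
hypotheses = np.zeros((3, 2, 3))
hypotheses[:, :, 0] = offsets[:, None]  # shift each hypothesis along x

loss, best = best_of_m_loss(hypotheses, target)
print(loss, best)
```

Because only the closest hypothesis is penalized, the remaining samples stay free to cover other plausible poses, which is the property that makes this family of losses attractive for multi-hypothesis prediction.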
