Coherent Reconstruction of Multiple Humans from a Single Image
Wen Jiang, Nikos Kolotouros, Georgios Pavlakos, Xiaowei Zhou, Kostas Daniilidis

TL;DR
This paper presents a method for multi-person 3D pose estimation from a single image. It produces coherent reconstructions by incorporating the SMPL parametric body model and two new loss functions, one that penalizes interpenetration between people and one that encourages correct depth ordering.
Contribution
The authors introduce a single-network framework trained with collision and depth-ordering losses, which improves the coherence of multi-person 3D reconstructions from images without explicit 3D annotations.
Findings
Outperforms previous methods on standard 3D pose benchmarks.
Enables more coherent multi-person reconstructions in natural images.
Uses novel loss functions to enforce physical plausibility and correct depth ordering.
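To make the physical-plausibility idea in the findings above more concrete, here is a minimal sketch of an interpenetration penalty. It is not the authors' implementation: each person's volume is approximated by a single sphere (an assumption made only so the distance field has a closed form), whereas the paper works with SMPL meshes. The names `penetration_depth` and `collision_loss` are hypothetical.

```python
import numpy as np


def penetration_depth(points, center, radius):
    """Distance field of a sphere stand-in: positive inside, zero outside.

    ASSUMPTION: a sphere replaces the body mesh purely for illustration.
    """
    signed = np.linalg.norm(points - center, axis=1) - radius  # negative inside
    return np.maximum(-signed, 0.0)


def collision_loss(vertex_sets, centers, radii):
    """Sum how deeply each person's vertices sit inside every other person."""
    total = 0.0
    for i, (center, radius) in enumerate(zip(centers, radii)):
        for j, verts in enumerate(vertex_sets):
            if i != j:  # a person never collides with itself
                total += penetration_depth(verts, center, radius).sum()
    return float(total)
```

The penalty is zero when no vertex of one person lies inside another person's volume and grows with penetration depth, so minimizing it pushes overlapping bodies apart.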
Abstract
In this work, we address the problem of multi-person 3D pose estimation from a single image. A typical regression approach in the top-down setting of this problem would first detect all humans and then reconstruct each one of them independently. However, this type of prediction suffers from incoherent results, e.g., interpenetration and inconsistent depth ordering between the people in the scene. Our goal is to train a single network that learns to avoid these problems and generate a coherent 3D reconstruction of all the humans in the scene. To this end, a key design choice is the incorporation of the SMPL parametric body model in our top-down framework, which enables the use of two novel losses. First, a distance field-based collision loss penalizes interpenetration among the reconstructed people. Second, a depth ordering-aware loss reasons about occlusions and promotes a depth…
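As an illustration of the depth ordering-aware loss described in the abstract, the following is a minimal sketch rather than the paper's exact formulation. It assumes that each person has a rendered depth map, that a per-pixel label map says which person should be visible, and that a mismatch is penalized with a softplus term; the function name, the array layout, and the use of `-1` for background are all assumptions made for this example.

```python
import numpy as np


def depth_ordering_loss(depth_maps, gt_index, pred_index):
    """Penalize pixels where the wrong person is rendered in front.

    depth_maps: (N, H, W) float, depth of each person rendered alone (assumed input)
    gt_index:   (H, W) int, person that should be visible, -1 for background
    pred_index: (H, W) int, person actually in front in the rendering, -1 for background
    """
    # Only pixels where both maps show a person and the two disagree contribute.
    mask = (gt_index >= 0) & (pred_index >= 0) & (gt_index != pred_index)
    if not mask.any():
        return 0.0
    rows, cols = np.nonzero(mask)
    d_gt = depth_maps[gt_index[rows, cols], rows, cols]
    d_pred = depth_maps[pred_index[rows, cols], rows, cols]
    # log(1 + exp(d_gt - d_pred)): large when the correct person is far behind.
    return float(np.logaddexp(0.0, d_gt - d_pred).sum())
```

Because the term only needs a label map and rendered depths, a loss of this shape can provide an ordering signal for images that have no 3D ground truth.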
Videos
Coherent Reconstruction of Multiple Humans From a Single Image · YouTube
Taxonomy
Topics: Human Pose and Action Recognition · Advanced Vision and Imaging · Video Surveillance and Tracking Methods
