LASOR: Learning Accurate 3D Human Pose and Shape Via Synthetic Occlusion-Aware Data and Neural Mesh Rendering
Kaibing Yang, Renshu Gu, Maoyu Wang, Masahiro Toyoura, Gang Xu

TL;DR
LASOR is an occlusion-aware framework for 3D human pose and shape estimation. It synthesizes occlusion-aware silhouette and 2D keypoint training data and uses a neural mesh renderer for silhouette supervision, achieving state-of-the-art results, especially in occluded scenarios.
Contribution
The paper presents a new method combining synthetic occlusion-aware data generation and neural mesh rendering for improved 3D human pose and shape estimation under occlusions.
Findings
Achieves state-of-the-art accuracy on 3DPW and 3DPW-Crowd datasets.
Outperforms existing methods like Mesh Transformer, 3DCrowdNet, and ROMP in shape estimation.
Demonstrates robustness to occlusion and across diverse viewpoints.
Abstract
A key challenge in the task of human pose and shape estimation is occlusion, including self-occlusions, object-human occlusions, and inter-person occlusions. The lack of diverse and accurate pose and shape training data becomes a major bottleneck, especially for scenes with occlusions in the wild. In this paper, we focus on the estimation of human pose and shape in the case of inter-person occlusions, while also handling object-human occlusions and self-occlusions. We propose a novel framework that synthesizes occlusion-aware silhouette and 2D keypoint data and directly regresses to the SMPL pose and shape parameters. A neural 3D mesh renderer is exploited to enable silhouette supervision on the fly, which contributes to great improvements in shape estimation. In addition, keypoints-and-silhouette-driven training data in panoramic viewpoints are synthesized to compensate for the lack of…
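As a rough illustration of what "occlusion-aware silhouette and 2D keypoint data" could look like, the sketch below overlays a second person's mask on a target silhouette, marks covered keypoints as not visible, and computes an intersection-over-union score of the kind often used for silhouette supervision. This is a toy on binary grids, not the authors' pipeline: the function names `occlude`, `visible_keypoints`, and `silhouette_iou` are hypothetical, and the paper's actual synthesis procedure and loss are not given in this summary.

```python
# Toy sketch (assumption, not the paper's code): silhouettes are lists of
# rows holding 0/1 pixels; keypoints are (x, y) pixel coordinates.

def occlude(target, occluder):
    """Return the visible part of `target`: pixels not covered by `occluder`."""
    return [
        [int(t and not o) for t, o in zip(t_row, o_row)]
        for t_row, o_row in zip(target, occluder)
    ]

def visible_keypoints(keypoints, occluder):
    """Attach a visibility flag to each keypoint (False if the occluder covers it)."""
    return [(x, y, not occluder[y][x]) for x, y in keypoints]

def silhouette_iou(a, b):
    """Intersection over union of two binary silhouettes of equal size."""
    pairs = [(p, q) for a_row, b_row in zip(a, b) for p, q in zip(a_row, b_row)]
    inter = sum(1 for p, q in pairs if p and q)
    union = sum(1 for p, q in pairs if p or q)
    return inter / union if union else 1.0

# Target person: a 2-pixel-wide column spanning all four rows (8 pixels).
target = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
# Occluding person: covers the bottom-right 2x2 corner.
occluder = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

visible = occlude(target, occluder)
keypoints = visible_keypoints([(1, 0), (2, 3)], occluder)
iou = silhouette_iou(visible, target)
```

Here two of the target's eight pixels are hidden, so the visible silhouette overlaps the full one with an IoU of 0.75, and the second keypoint is flagged as occluded. In a real training setup the silhouette would come from a differentiable mesh renderer applied to the predicted SMPL mesh, so the overlap term could be backpropagated to the pose and shape parameters.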
Taxonomy
Topics: Human Pose and Action Recognition · Human Motion and Animation · Robot Manipulation and Learning
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Residual Connection · Dense Connections · Byte Pair Encoding · Absolute Position Encodings · Softmax · Dropout · Position-Wise Feed-Forward Layer
