PostoMETRO: Pose Token Enhanced Mesh Transformer for Robust 3D Human Mesh Recovery
Wendi Yang, Zihang Jiang, Shang Zhao, S. Kevin Zhou

TL;DR
PostoMETRO introduces a novel transformer-based approach that uses a pose tokenizer to enhance 3D human mesh recovery, especially under occlusion, by effectively integrating 2D pose information with image features.
Contribution
The paper proposes a pose tokenization method within transformers to improve 3D human mesh recovery under occlusion, leveraging rich 2D pose annotations more effectively.
Findings
Outperforms existing methods on standard and occlusion-specific benchmarks.
Produces more accurate 3D meshes under occlusion scenarios.
Demonstrates the effectiveness of pose tokens in 3D reconstruction.
Abstract
With the recent advancements in single-image-based human mesh recovery, there is a growing interest in enhancing its performance in certain extreme scenarios, such as occlusion, while maintaining overall model accuracy. Although obtaining accurately annotated 3D human poses under occlusion is challenging, there is still a wealth of rich and precise 2D pose annotations that can be leveraged. However, existing works mostly focus on directly leveraging 2D pose coordinates to estimate 3D pose and mesh. In this paper, we present PostoMETRO (Pose token enhanced Mesh Transformer), which integrates occlusion-resilient 2D pose representation into transformers in a token-wise manner. Utilizing a specialized pose tokenizer, we efficiently condense 2D pose data to a compact sequence of pose tokens and feed them to the transformer together with…
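As a rough illustration of the token-wise idea described above, the following sketch maps 2D pose features to discrete tokens and appends them to a sequence of image tokens. It assumes a nearest-neighbour codebook lookup (vector-quantization style), which the abstract does not specify. The function names, the toy codebook, and the 2-dimensional features are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

Vector = List[float]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def tokenize_pose(pose_features: List[Vector], codebook: List[Vector]) -> List[int]:
    """Map each pose feature to the index of its nearest codebook entry.

    Assumption: a vector-quantization-style lookup; the actual pose
    tokenizer in the paper may work differently.
    """
    return [
        min(range(len(codebook)), key=lambda k: squared_distance(f, codebook[k]))
        for f in pose_features
    ]


def build_transformer_input(
    image_tokens: List[Vector], pose_token_ids: List[int], codebook: List[Vector]
) -> List[Vector]:
    """Look up the pose token embeddings and append them to the image tokens."""
    pose_tokens = [codebook[i] for i in pose_token_ids]
    return image_tokens + pose_tokens


# Toy data (hypothetical): 4 codebook entries, 3 pose features, 2 image tokens.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pose_features = [[0.9, 0.1], [0.1, 0.8], [0.2, 0.1]]
image_tokens = [[0.5, 0.5], [0.3, 0.7]]

token_ids = tokenize_pose(pose_features, codebook)
sequence = build_transformer_input(image_tokens, token_ids, codebook)
print(token_ids)      # indices of the nearest codebook entries
print(len(sequence))  # image tokens followed by pose tokens
```

In a real model the combined sequence would be passed to a transformer encoder; here it is only assembled, to show how a compact set of pose tokens can sit alongside image features in one input.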
Taxonomy
Topics: 3D Shape Modeling and Analysis · Human Pose and Action Recognition · Gait Recognition and Analysis
