3D WholeBody Pose Estimation based on Semantic Graph Attention Network and Distance Information
Sihan Wen, Xiantan Zhu, Zhiming Tan

TL;DR
This paper introduces a novel Semantic Graph Attention Network for 3D whole-body pose estimation that combines self-attention, graph convolutions, distance information, and geometry constraints to improve accuracy and structural consistency.
Contribution
The paper presents a new Semantic Graph Attention Network with a Body Part Decoder and Geometry Loss, integrating multiple techniques to enhance 3D pose estimation accuracy.
Findings
Outperforms existing state-of-the-art methods on benchmark datasets.
Effectively captures global context and local structural constraints.
Improves spatial relationship understanding through distance information.
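The summary does not say how the distance information is computed. As a minimal sketch, assuming it can be represented as pairwise Euclidean distances between predicted 3D joints, the feature could look like the following. The function name `pairwise_joint_distances` and the toy skeleton are hypothetical, not taken from the paper.

```python
import math

def pairwise_joint_distances(joints):
    """Return the symmetric matrix of Euclidean distances between joints.

    Assumption: 'distance information' is read here as joint-to-joint
    Euclidean distance; the paper's exact formulation may differ.
    """
    return [[math.dist(p, q) for q in joints] for p in joints]

# Hypothetical 3-joint toy skeleton given as (x, y, z) coordinates.
toy_joints = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
distances = pairwise_joint_distances(toy_joints)
```

A matrix like this is invariant to translation and rotation of the whole pose, which is one reason distance features are often used to describe spatial relationships between joints.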
Abstract
In recent years, a plethora of diverse methods have been proposed for 3D pose estimation. Among these, self-attention mechanisms and graph convolutions have both proven effective and practical. Recognizing the strengths of these two techniques, we have developed a novel Semantic Graph Attention Network that benefits from the ability of self-attention to capture global context while also using graph convolutions to handle the local connectivity and structural constraints of the skeleton. We also design a Body Part Decoder that assists in extracting and refining the information related to specific segments of the body. Furthermore, our approach incorporates Distance Information, enhancing our model's capability to comprehend and accurately predict spatial relationships. Finally, we introduce a Geometry Loss that imposes a critical constraint on the structural…
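To illustrate the idea of pairing global self-attention with local graph convolution, here is a minimal sketch in plain Python. It is not the authors' architecture: it assumes the two branches are simply summed, uses no learned weights (queries, keys, and values are all the raw joint features), and uses a symmetrically normalized adjacency for the graph branch. All function names and the toy skeleton are hypothetical.

```python
import math

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax_rows(m):
    """Row-wise softmax with max-subtraction for numerical stability."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def normalized_adjacency(edges, n):
    """Symmetrically normalized adjacency with self-loops: D^-1/2 (A + I) D^-1/2."""
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def graph_attention_block(x, edges):
    """Sum a global self-attention branch and a local graph-convolution branch."""
    n, d = len(x), len(x[0])
    # Global branch: scaled dot-product attention over all joints (Q = K = V = x).
    x_transposed = [list(col) for col in zip(*x)]
    scores = [[s / math.sqrt(d) for s in row] for row in matmul(x, x_transposed)]
    attended = matmul(softmax_rows(scores), x)
    # Local branch: each joint aggregates features from its skeleton neighbours only.
    aggregated = matmul(normalized_adjacency(edges, n), x)
    # Assumption: the branches are fused by element-wise addition.
    return [[g + l for g, l in zip(g_row, l_row)]
            for g_row, l_row in zip(attended, aggregated)]

# Hypothetical 3-joint chain (0-1-2) with 2-D features per joint.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
output = graph_attention_block(features, edges=[(0, 1), (1, 2)])
```

In this sketch the attention branch lets every joint see every other joint, while the adjacency matrix restricts the second branch to bones of the skeleton, so joints 0 and 2 exchange information only through the attention path.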
Taxonomy
Topics: Human Pose and Action Recognition · Human Motion and Animation · Image Processing and 3D Reconstruction
