Pose-Oriented Transformer with Uncertainty-Guided Refinement for 2D-to-3D Human Pose Estimation
Han Li, Bowen Shi, Wenrui Dai, Hongwei Zheng, Botao Wang, Yu Sun, Min Guo, Chenlin Li, Junni Zou, Hongkai Xiong

TL;DR
This paper introduces a novel pose-oriented transformer with uncertainty-guided refinement that explicitly models human skeleton topology and joint difficulty, significantly improving 3D human pose estimation accuracy.
Contribution
The paper proposes a pose-oriented self-attention mechanism and an uncertainty-guided refinement network, incorporating skeleton topology and joint difficulty into transformer-based 3D human pose estimation.
Findings
Outperforms state-of-the-art methods on Human3.6M and MPI-INF-3DHP datasets.
Reduces model parameters while maintaining high accuracy.
Effectively models joint interactions and difficulty levels for improved pose estimation.
Abstract
There has been a recent surge of interest in introducing transformers to 3D human pose estimation (HPE) due to their powerful capabilities in modeling long-term dependencies. However, existing transformer-based methods treat body joints as equally important inputs and ignore the prior knowledge of human skeleton topology in the self-attention mechanism. To tackle this issue, we propose a Pose-Oriented Transformer (POT) with uncertainty-guided refinement for 3D HPE. Specifically, we first develop a novel pose-oriented self-attention mechanism and a distance-related position embedding for POT to explicitly exploit the human skeleton topology. The pose-oriented self-attention mechanism explicitly models the topological interactions between body joints, whereas the distance-related position embedding encodes the distance of joints to the root joint to distinguish groups of joints…
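As a rough illustration of the distance-to-root idea mentioned in the abstract, the sketch below computes each joint's hop distance to the root along the skeleton tree and groups joints by that distance. It is not the authors' implementation: the 17-joint parent list is an assumed Human3.6M-style ordering commonly used in open-source code, and the names `H36M_PARENTS`, `hop_distance_to_root`, and `group_by_distance` are hypothetical. How POT turns these distances into an embedding is not specified in the text above, so the sketch stops at the distance index that such an embedding could be looked up by.

```python
from collections import defaultdict

# ASSUMPTION: a 17-joint Human3.6M-style skeleton given as a parent list,
# where parents[j] is the parent joint of j and -1 marks the root (pelvis).
# This ordering is illustrative and not taken from the paper.
H36M_PARENTS = [-1, 0, 1, 2, 0, 4, 5, 0, 7, 8, 9, 8, 11, 12, 8, 14, 15]


def hop_distance_to_root(parents):
    """Return, for each joint, the number of bones between it and the root."""
    distances = []
    for joint in range(len(parents)):
        hops = 0
        parent = parents[joint]
        # Walk up the kinematic tree until the root is reached.
        while parent != -1:
            hops += 1
            parent = parents[parent]
        distances.append(hops)
    return distances


def group_by_distance(distances):
    """Group joint indices that share the same hop distance to the root."""
    groups = defaultdict(list)
    for joint, dist in enumerate(distances):
        groups[dist].append(joint)
    return dict(groups)


distances = hop_distance_to_root(H36M_PARENTS)
groups = group_by_distance(distances)
```

Under these assumptions, joints in the same group (for example, both wrists) would share one position embedding, since a learned embedding table would be indexed by `distances[j]` rather than by the joint index `j`.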
Taxonomy
Topics: Human Pose and Action Recognition · Hand Gesture Recognition Systems · Anomaly Detection Techniques and Applications
Methods: Multi-Head Attention · Attention Is All You Need · Layer Normalization · Linear Layer · Dense Connections · Label Smoothing · Absolute Position Encodings · Adam · Position-Wise Feed-Forward Layer · Softmax
