HandDAGT: A Denoising Adaptive Graph Transformer for 3D Hand Pose Estimation
Wencan Cheng, Eunji Kim, Jong Hwan Ko

TL;DR
HandDAGT introduces a novel transformer-based approach with a denoising training strategy to improve 3D hand pose estimation accuracy and robustness under occlusion conditions, outperforming existing methods.
Contribution
This paper presents HandDAGT, a new graph transformer model with an adaptive attention mechanism and denoising training for robust 3D hand pose estimation.
Findings
Significantly outperforms existing methods on four benchmark datasets.
Effectively handles self-occlusion and object interaction scenarios.
Demonstrates robustness and accuracy improvements in challenging conditions.
Abstract
The extraction of keypoint positions from input hand frames, known as 3D hand pose estimation, is crucial for various human-computer interaction applications. However, current approaches often struggle with the dynamic nature of self-occlusion of hands and intra-occlusion with interacting objects. To address this challenge, this paper proposes the Denoising Adaptive Graph Transformer, HandDAGT, for hand pose estimation. The proposed HandDAGT leverages a transformer structure to thoroughly explore effective geometric features from input patches. Additionally, it incorporates a novel attention mechanism to adaptively weigh the contribution of kinematic correspondence and local geometric features for the estimation of specific keypoints. This attribute enables the model to adaptively employ kinematic and local information based on the occlusion situation, enhancing its robustness and…
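The adaptive weighting described in the abstract can be illustrated with a small sketch. This is not the authors' architecture. It is a minimal, assumed formulation in which each keypoint has a hypothetical query vector that scores two candidate feature vectors (one kinematic, one local-geometric), and a softmax over the two scores decides how much each source contributes. The names `fuse_keypoint`, `query`, `kinematic_feat`, and `local_feat` are illustrative and do not come from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a short list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def fuse_keypoint(query, kinematic_feat, local_feat):
    """Blend two feature sources for one keypoint (illustrative assumption).

    The query scores each source with scaled dot-product attention;
    the softmax weights then mix the two feature vectors.
    """
    scale = math.sqrt(len(query))
    scores = [
        dot(query, kinematic_feat) / scale,  # affinity to the kinematic cue
        dot(query, local_feat) / scale,      # affinity to the local geometric cue
    ]
    w_kin, w_loc = softmax(scores)
    fused = [w_kin * k + w_loc * g for k, g in zip(kinematic_feat, local_feat)]
    return fused, (w_kin, w_loc)


# Toy case: the query aligns with the kinematic feature, so that source
# receives the larger weight (as might be desirable when local evidence
# for a keypoint is occluded).
fused, (w_kin, w_loc) = fuse_keypoint([1.0, 0.0], [2.0, 0.0], [0.0, 2.0])
print(w_kin > w_loc)
```

In this toy setup the two weights always sum to one, so a keypoint whose local evidence is unreliable can lean on kinematic context and vice versa. How the actual model computes these weights is not specified in the text above.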
Taxonomy
Topics: Human Pose and Action Recognition · Hand Gesture Recognition Systems · Gait Recognition and Analysis
Methods: Attention Is All You Need · Laplacian EigenMap · Label Smoothing · Laplacian Positional Encodings · Adam · Linear Layer · Byte Pair Encoding · Layer Normalization · Softmax · Position-Wise Feed-Forward Layer
