1st Place Solution of Egocentric 3D Hand Pose Estimation Challenge 2023 Technical Report: A Concise Pipeline for Egocentric Hand Pose Reconstruction
Zhishan Zhou, Zhi Lv, Shihao Zhou, Minqiang Zou, Tong Wu, Mochen Yu, Yao Tang, Jiajun Liang

TL;DR
This paper presents the winning solution for the 2023 Egocentric 3D Hand Pose Estimation Challenge, combining ViT-based models, multi-view post-processing, and data augmentation to improve accuracy in challenging occlusion scenarios.
Contribution
The paper introduces a simple yet effective pipeline using ViT backbones, multi-view merging, and data augmentation, achieving state-of-the-art results in egocentric 3D hand pose estimation.
Findings
Achieved 12.21 mm MPJPE on the test dataset
Multi-view post-processing improves occlusion handling
Test-time augmentation and model ensembling boost accuracy
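For readers unfamiliar with the metric, MPJPE (mean per-joint position error) is the average Euclidean distance between predicted and ground-truth 3D joints. The snippet below is a minimal sketch of that definition, not the challenge's official evaluation code; the function name `mpjpe`, the array layout `(num_samples, num_joints, 3)`, and the choice of 21 joints per hand in the toy data are assumptions made for illustration.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean per-joint position error, in the same units as the inputs (e.g. mm).

    pred, gt: arrays of shape (num_samples, num_joints, 3).
    This layout is an assumption for illustration only.
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    if pred.shape != gt.shape:
        raise ValueError("pred and gt must have the same shape")
    # Euclidean distance per joint, then the mean over all joints and samples.
    per_joint_error = np.linalg.norm(pred - gt, axis=-1)
    return float(per_joint_error.mean())


# Toy data: every predicted joint is offset by (3, 4, 0) from the ground truth,
# so every per-joint distance is exactly 5.
gt = np.zeros((2, 21, 3))
pred = gt.copy()
pred[..., 0] += 3.0
pred[..., 1] += 4.0
print(mpjpe(pred, gt))  # 5.0
```

Under this definition, the reported 12.21 mm means that predicted joints lie, on average, about 12 mm from their annotated positions.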
Abstract
This report introduces our work for the Egocentric 3D Hand Pose Estimation workshop. Built on AssemblyHands, the challenge focuses on egocentric 3D hand pose estimation from a single-view image. In the competition, we adopt ViT-based backbones and a simple regressor for 3D keypoint prediction, which provide strong model baselines. We noticed that hand-object occlusions and self-occlusions lead to performance degradation, and therefore propose a non-model method that merges multi-view results in the post-processing stage. Moreover, we use test-time augmentation and model ensembling for further improvement. We also found that public datasets and sensible preprocessing are beneficial. Our method achieved 12.21 mm MPJPE on the test dataset, taking first place in the Egocentric 3D Hand Pose Estimation challenge.
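The abstract does not say how predictions from augmented passes or ensemble members were combined. A minimal sketch, assuming each prediction has already been mapped back to a common coordinate frame and that a plain or weighted average is used, could look like the following. The function name `merge_predictions` and the weighting scheme are hypothetical and are not taken from the report.

```python
import numpy as np


def merge_predictions(preds, weights=None):
    """Average K keypoint predictions of shape (K, num_joints, 3).

    Assumes all predictions are expressed in the same coordinate frame.
    `weights` is an optional length-K sequence (hypothetical, e.g. one
    weight per model); it is normalized to sum to 1 before use.
    """
    preds = np.asarray(preds, dtype=np.float64)
    if weights is None:
        return preds.mean(axis=0)
    w = np.asarray(weights, dtype=np.float64)
    if w.shape != (preds.shape[0],):
        raise ValueError("weights must have one entry per prediction")
    w = w / w.sum()
    # Weighted sum over the first (prediction) axis; the result is (num_joints, 3).
    return np.tensordot(w, preds, axes=1)


# Two toy predictions for a 21-joint hand: all zeros and all twos.
pred_a = np.zeros((21, 3))
pred_b = np.full((21, 3), 2.0)

uniform = merge_predictions([pred_a, pred_b])
weighted = merge_predictions([pred_a, pred_b], weights=[3.0, 1.0])
print(uniform[0], weighted[0])
```

With uniform weights the merged joints are the element-wise mean of the inputs; with weights 3:1 the first prediction contributes three quarters of the result.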
Taxonomy
Topics: Human Pose and Action Recognition · Hand Gesture Recognition Systems
