AirGlove: Exploring Egocentric 3D Hand Tracking and Appearance Generalization for Sensing Gloves
Wenhui Cui, Ziyi Kou, Chuan Qin, Ergys Ristani, Li Guan

TL;DR
This paper evaluates vision-based 3D hand tracking models on sensing gloves, identifies performance gaps due to appearance differences, and introduces AirGlove to improve generalization across glove types with limited data.
Contribution
The paper systematically assesses existing models on gloved hands and proposes AirGlove, a method that enhances generalization to new glove designs using limited data.
Findings
Existing bare-hand models perform poorly on gloved hands due to the appearance gap.
AirGlove significantly improves generalization to new glove types.
Experiments show AirGlove outperforms baseline schemes.
Abstract
Sensing gloves have become important tools for teleoperation and robotic policy learning as they are able to provide rich signals like speed, acceleration and tactile feedback. A common approach to track gloved hands is to directly use the sensor signals (e.g., angular velocity, gravity orientation) to estimate 3D hand poses. However, sensor-based tracking can be restrictive in practice as the accuracy is often impacted by sensor signal and calibration quality. Recent advances in vision-based approaches have achieved strong performance on human hands via large-scale pre-training, but their performance on gloved hands with distinct visual appearances remains underexplored. In this work, we present the first systematic evaluation of vision-based hand tracking models on gloved hands under both zero-shot and fine-tuning setups. Our analysis shows that existing bare-hand models suffer from…
Taxonomy
Topics: Hand Gesture Recognition Systems · Robot Manipulation and Learning · Human Pose and Action Recognition
