Fast-HaMeR: Boosting Hand Mesh Reconstruction using Knowledge Distillation
Hunain Ahmed Jillani, Ahmed Tawfik Aboukhadra, Ahmed Elhayek, Jameel Malik, Nadia Robertini, Didier Stricker

TL;DR
Fast-HaMeR demonstrates that lightweight neural networks combined with knowledge distillation can significantly accelerate 3D hand reconstruction models like HaMeR, enabling real-time performance on resource-limited devices without substantial accuracy loss.
Contribution
This work introduces a method to replace heavy backbones in the HaMeR hand reconstruction model with lightweight alternatives using knowledge distillation, achieving faster inference with minimal accuracy loss.
Findings
Lightweight backbones achieve 1.5x faster inference.
Minimal accuracy difference of 0.4 mm with smaller models.
Output-level distillation improves student performance.
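To make the idea of output-level distillation concrete, the following is a minimal sketch, not the authors' implementation. It assumes the student is trained against both the ground-truth labels and the teacher's predicted outputs (for example, 3D hand vertices), with the two terms blended by a weight. The function names, the L1 distance, and the `alpha` weight are all illustrative assumptions; this summary does not specify the paper's actual loss.

```python
from typing import Sequence

Point = Sequence[float]


def mean_l1(pred: Sequence[Point], target: Sequence[Point]) -> float:
    """Mean absolute difference over all coordinates of two point sets."""
    total = 0.0
    count = 0
    for p, t in zip(pred, target):
        for a, b in zip(p, t):
            total += abs(a - b)
            count += 1
    return total / count


def output_distillation_loss(
    student_out: Sequence[Point],
    teacher_out: Sequence[Point],
    ground_truth: Sequence[Point],
    alpha: float = 0.5,  # hypothetical weight, not taken from the paper
) -> float:
    """Blend label supervision with supervision from the teacher's outputs."""
    task_term = mean_l1(student_out, ground_truth)    # student vs. labels
    distill_term = mean_l1(student_out, teacher_out)  # student vs. teacher
    return (1.0 - alpha) * task_term + alpha * distill_term


# Toy data: two 3D points per "mesh" (made-up values for illustration only).
student = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
teacher = [[0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]
gt = [[0.0, 0.0, 0.0], [1.0, 1.0, 3.0]]

loss = output_distillation_loss(student, teacher, gt, alpha=0.5)
print(loss)
```

In a real training loop the teacher would be the frozen heavy model and the student the lightweight-backbone model; only the student's parameters would be updated from this loss.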
Abstract
Fast and accurate 3D hand reconstruction is essential for real-time applications in VR/AR, human-computer interaction, robotics, and healthcare. Most state-of-the-art methods rely on heavy models, limiting their use on resource-constrained devices like headsets, smartphones, and embedded systems. In this paper, we investigate how the use of lightweight neural networks, combined with Knowledge Distillation, can accelerate complex 3D hand reconstruction models by making them faster and lighter, while maintaining comparable reconstruction accuracy. While our approach is suited for various hand reconstruction frameworks, we focus primarily on boosting the HaMeR model, currently the leading method in terms of reconstruction accuracy. We replace its original ViT-H backbone with lighter alternatives, including MobileNet, MobileViT, ConvNeXt, and ResNet, and evaluate three knowledge…
Taxonomy
Topics: Hand Gesture Recognition Systems · Human Pose and Action Recognition · Interactive and Immersive Displays
