One for All: An End-to-End Compact Solution for Hand Gesture Recognition
Monu Verma, Ayushi Gupta, Santosh Kumar Vipparthi

TL;DR
This paper introduces Fit-Hand, an end-to-end compact CNN framework for hand gesture recognition that employs attention mechanisms and dilated convolutions to improve accuracy across diverse challenging conditions.
Contribution
The novel Fit-Hand architecture combines fine-grained attention and dilated convolutions in an end-to-end model, eliminating the need for complex pre-processing stages.
Findings
Achieved high accuracy on seven benchmark datasets.
Validated effectiveness through subject-dependent and subject-independent setups.
Performed extensive ablation studies to analyze components.
Abstract
Hand gesture recognition (HGR) is a challenging task, as its performance is influenced by factors such as illumination variations, cluttered backgrounds, and spontaneous capture. Conventional CNNs for HGR follow a two-stage pipeline to deal with these challenges: complex signs, illumination variations, and complex, cluttered backgrounds. Existing approaches require expert knowledge as well as auxiliary computation at stage 1 to remove these complexities from the input images. Therefore, in this paper, we propose a novel end-to-end compact CNN framework, the fine-grained feature attentive network for hand gesture recognition (Fit-Hand), to address the challenges discussed above. The pipeline of the proposed architecture consists of two main units: the FineFeat module and a dilated convolutional (Conv) layer. The FineFeat module extracts fine-grained feature maps by employing…
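For readers unfamiliar with the dilated convolution named in the abstract, the general operation can be sketched in one dimension as below. This is a minimal illustration in plain Python, not the Fit-Hand layer itself: the function names `dilated_conv1d` and `receptive_field` are hypothetical, and the kernel weights, dilation rate, and input are made-up values chosen only to show how dilation widens the receptive field without adding weights.

```python
def receptive_field(kernel_size, dilation):
    """Number of input positions spanned by one output of a dilated kernel."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1D dilated cross-correlation, the form used in CNN layers.

    Kernel taps are spaced `dilation` positions apart, so a kernel with
    len(kernel) weights covers receptive_field(len(kernel), dilation) inputs.
    Illustrative only; not the layer configuration from the paper.
    """
    span = receptive_field(len(kernel), dilation)
    outputs = []
    for start in range(len(signal) - span + 1):
        # Sample the input at strided offsets and weight each sample.
        total = sum(w * signal[start + k * dilation] for k, w in enumerate(kernel))
        outputs.append(total)
    return outputs


# Hypothetical input and kernel, for illustration only.
signal = [1, 2, 3, 4, 5]
kernel = [1, 1, 1]

dense = dilated_conv1d(signal, kernel, dilation=1)    # taps at offsets 0, 1, 2
dilated = dilated_conv1d(signal, kernel, dilation=2)  # taps at offsets 0, 2, 4
```

With the same three weights, a dilation of 2 spans five input positions instead of three, which is the usual motivation for dilated layers: a wider context at no extra parameter cost.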
Taxonomy
Methods: Convolution · Dilated Convolution
