Novel Human Machine Interface via Robust Hand Gesture Recognition System using Channel Pruned YOLOv5s Model
Abir Sen, Tapas Kumar Mishra, Ratnakar Dash

TL;DR
This paper presents a fast, robust hand gesture recognition system using a channel-pruned YOLOv5s model, significantly improving detection speed and accuracy for real-time human-computer interaction applications.
Contribution
It introduces a novel channel pruning approach for YOLOv5s, enhancing real-time gesture detection performance for HCI systems.
Findings
Achieved over 60 fps detection speed in real-time scenarios.
Outperformed state-of-the-art methods in accuracy metrics.
Demonstrated effective deployment in multimedia control applications.
Abstract
Hand gesture recognition (HGR) is a vital component in enhancing the human-computer interaction experience, particularly in multimedia applications such as virtual reality, gaming, and smart home automation systems. By accurately detecting and recognizing gestures, such systems let users control and navigate these applications seamlessly. However, in real-time scenarios, the performance of a gesture recognition system can degrade in the presence of complex backgrounds, low-light illumination, and occlusion. Another challenge is building a fast and robust gesture-controlled human-computer interface (HCI) that works in real time. The overall objective of this paper is to develop an efficient hand gesture detection and classification model using a channel-pruned YOLOv5-small model and to use that model to build a gesture-controlled HCI with a quick response time (in ms)…
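To make the idea of channel pruning concrete, the following is a minimal sketch, not the authors' method: the excerpt above does not state which pruning criterion the paper uses, so this example assumes a common stand-in (ranking output channels by the L1 norm of their filters). The function name `prune_conv_pair`, the `keep_ratio` parameter, and the random weights are all hypothetical and only illustrate how removing output channels from one convolution also shrinks the input channels of the next.

```python
import numpy as np

def prune_conv_pair(w1, w2, keep_ratio=0.5):
    """Prune output channels of conv layer 1 and matching inputs of layer 2.

    w1: weights of shape (out1, in1, k, k)
    w2: weights of shape (out2, out1, k, k)
    Assumption: channel importance = L1 norm of each filter in w1.
    """
    # One importance score per output channel of the first layer.
    scores = np.abs(w1).sum(axis=(1, 2, 3))
    # Keep at least one channel.
    n_keep = max(1, int(round(keep_ratio * w1.shape[0])))
    # Indices of the highest-scoring channels, restored to original order.
    keep = np.sort(np.argsort(scores)[-n_keep:])
    # Drop filters from layer 1 and the corresponding input slices of layer 2.
    return w1[keep], w2[:, keep], keep

# Hypothetical toy layers: 3 -> 8 -> 16 channels with 3x3 kernels.
rng = np.random.default_rng(0)
w1 = rng.normal(size=(8, 3, 3, 3))
w2 = rng.normal(size=(16, 8, 3, 3))

p1, p2, kept = prune_conv_pair(w1, w2, keep_ratio=0.5)
print(p1.shape, p2.shape)
```

With `keep_ratio=0.5`, half of the first layer's filters are removed, so both layers do less work per frame. In practice a pruned detector is usually fine-tuned afterwards to recover accuracy; whether and how the paper does this is not stated in the excerpt.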
Taxonomy
Topics: Hand Gesture Recognition Systems · IoT-based Smart Home Systems · Robotics and Automated Systems
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
