Efficient Gesture Recognition for the Assistance of Visually Impaired People using Multi-Head Neural Networks
Samer Alashhab, Antonio Javier Gallego, Miguel Ángel Lozano

TL;DR
This paper introduces a multi-head neural network system for mobile gesture recognition to assist visually impaired users, enabling actions like object recognition and scene description through hand gestures.
Contribution
It presents a novel multi-head neural network architecture in which a shared backbone first detects and classifies the gesture, and a second stage then executes the action tied to that gesture. The network is trained on a large, diverse dataset.
Findings
The system achieves high accuracy in both gesture classification and gesture localization.
It performs well under varied lighting and background conditions.
Its results are competitive with state-of-the-art methods.
Abstract
This paper proposes an interactive system for mobile devices controlled by hand gestures aimed at helping people with visual impairments. This system allows the user to interact with the device by making simple static and dynamic hand gestures. Each gesture triggers a different action in the system, such as object recognition, scene description or image scaling (e.g., pointing a finger at an object will show a description of it). The system is based on a multi-head neural network architecture, which initially detects and classifies the gestures, and subsequently, depending on the gesture detected, performs a second stage that carries out the corresponding action. This multi-head architecture optimizes the resources required to perform different tasks simultaneously, and takes advantage of the information obtained from an initial backbone to perform different processes in a second stage.…
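The two-stage flow described in the abstract can be sketched in a few lines. This is only an illustration of the control flow (one shared backbone pass, a gesture head, then a gesture-dependent action head), not the authors' network: the functions `backbone`, `gesture_head`, `describe_object`, `describe_scene`, the gesture labels other than pointing, and all thresholds are hypothetical stand-ins for trained models.

```python
from typing import Callable, Dict, List, Optional, Tuple

Features = List[float]


def backbone(image: List[float]) -> Features:
    """Hypothetical shared feature extractor: returns [mean, max] of the pixels."""
    return [sum(image) / len(image), max(image)]


def gesture_head(features: Features) -> str:
    """Hypothetical first-stage head: maps features to a gesture label."""
    mean, peak = features
    if peak < 0.2:  # assumed threshold: nothing hand-like in the frame
        return "none"
    return "point" if mean < 0.5 else "open_palm"


def describe_object(features: Features) -> str:
    """Hypothetical second-stage head for the pointing gesture."""
    return "object description"


def describe_scene(features: Features) -> str:
    """Hypothetical second-stage head for an assumed open-palm gesture."""
    return "scene description"


# Each gesture selects one second-stage head; the mapping is an assumption.
ACTION_HEADS: Dict[str, Callable[[Features], str]] = {
    "point": describe_object,
    "open_palm": describe_scene,
}


def run(image: List[float]) -> Tuple[str, Optional[str]]:
    """Run the backbone once, classify the gesture, then run only the matching head."""
    features = backbone(image)        # computed once and reused by every head
    gesture = gesture_head(features)  # stage 1: detect and classify the gesture
    head = ACTION_HEADS.get(gesture)  # stage 2: pick the head for that gesture
    return gesture, head(features) if head else None


if __name__ == "__main__":
    print(run([0.1, 0.9, 0.2]))
    print(run([0.0, 0.1]))
```

The point of the design is that the expensive feature extraction happens once per frame, and only the action head that the detected gesture calls for is evaluated afterwards.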
