Efficient Gesture Recognition for the Assistance of Visually Impaired People using Multi-Head Neural Networks

Samer Alashhab; Antonio Javier Gallego; Miguel Ángel Lozano

arXiv:2205.06980 · cs.CV · July 27, 2022

TL;DR

This paper introduces a multi-head neural network system for mobile gesture recognition to assist visually impaired users, enabling actions like object recognition and scene description through hand gestures.

Contribution

It presents a novel multi-head neural network architecture optimized for simultaneous gesture detection and action execution, trained on a large, diverse dataset.

Findings

01. Achieved high accuracy in gesture classification and localization.

02. The system performs well in varied lighting and background conditions.

03. Achieved competitive results compared to state-of-the-art methods.

Abstract

This paper proposes an interactive system for mobile devices controlled by hand gestures aimed at helping people with visual impairments. This system allows the user to interact with the device by making simple static and dynamic hand gestures. Each gesture triggers a different action in the system, such as object recognition, scene description or image scaling (e.g., pointing a finger at an object will show a description of it). The system is based on a multi-head neural network architecture, which initially detects and classifies the gestures, and subsequently, depending on the gesture detected, performs a second stage that carries out the corresponding action. This multi-head architecture optimizes the resources required to perform different tasks simultaneously, and takes advantage of the information obtained from an initial backbone to perform different processes in a second stage.…
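The two-stage flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the functions `backbone`, `gesture_head`, `describe_object`, `describe_scene`, and `run`, the gesture labels, and the toy thresholds are all hypothetical stand-ins chosen for illustration. The only points it tries to show are that the shared features are computed once and that the detected gesture selects which second-stage head runs.

```python
from typing import Callable, Dict, List, Optional, Tuple

Features = List[float]


def backbone(image: List[float]) -> Features:
    """Hypothetical shared feature extractor (a real system would use a CNN)."""
    mean = sum(image) / len(image)
    return [mean, max(image), min(image)]


def gesture_head(features: Features) -> str:
    """Toy first-stage classifier: maps shared features to a gesture label."""
    mean, high, _low = features
    if high - mean > 0.5:
        return "point"
    if mean > 0.5:
        return "open_hand"
    return "none"


def describe_object(features: Features) -> str:
    """Stub second-stage head for object recognition."""
    return "object description (stub)"


def describe_scene(features: Features) -> str:
    """Stub second-stage head for scene description."""
    return "scene description (stub)"


# Assumed gesture-to-action mapping; only "pointing -> object description"
# is suggested by the abstract, the other entry is illustrative.
SECOND_STAGE: Dict[str, Callable[[Features], str]] = {
    "point": describe_object,
    "open_hand": describe_scene,
}


def run(image: List[float]) -> Tuple[str, Optional[str]]:
    """Run the backbone once, classify the gesture, then dispatch to one head."""
    features = backbone(image)          # computed a single time
    gesture = gesture_head(features)    # first stage
    head = SECOND_STAGE.get(gesture)    # second stage depends on the gesture
    result = head(features) if head else None
    return gesture, result


if __name__ == "__main__":
    print(run([0.0, 0.0, 1.0]))
    print(run([0.8, 0.9, 0.7]))
    print(run([0.1, 0.2, 0.1]))
```

Because each second-stage head receives the features already produced by the backbone, adding another action in this sketch only requires a new entry in `SECOND_STAGE`, without recomputing the first stage.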
