Multimodal Appearance based Gaze-Controlled Virtual Keyboard with Synchronous Asynchronous Interaction for Low-Resource Settings
Yogesh Kumar Meena, Manish Salvi

TL;DR
This paper introduces a multimodal gaze-controlled virtual keyboard using deep learning and standard webcams, enabling efficient typing for individuals with mobility impairments, especially in low-resource settings.
Contribution
It presents a novel multimodal appearance-based gaze system with synchronous and asynchronous modes, supporting comprehensive command and character input using standard hardware.
Findings
Average typing speed of 10.94 letters/min with a webcam in synchronous mode
Information transfer rate (ITR) of approximately 63.56 bits/min at the letter level with a webcam in synchronous mode
Good usability and low workload demonstrated in low-resource environments
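As a rough sanity check on how the two headline numbers relate, the sketch below computes an information transfer rate from a selection rate using the Wolpaw definition that is common in the gaze and BCI literature. This summary does not say which ITR formula the authors used, so the formula choice, the function names, and the assumption of error-free selection are illustrative only. The 56-character alphabet is taken from the abstract.

```python
import math


def bits_per_selection(n_targets: int, accuracy: float) -> float:
    """Bits conveyed by one selection among n_targets (Wolpaw definition).

    Assumption: this is the standard Wolpaw formula, not necessarily the
    one used in the paper.
    """
    if n_targets < 2:
        raise ValueError("n_targets must be at least 2")
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie in [0, 1]")

    bits = math.log2(n_targets)
    # By convention, terms of the form p * log2(p) are taken as 0 when p == 0.
    if accuracy > 0.0:
        bits += accuracy * math.log2(accuracy)
    if accuracy < 1.0:
        error = 1.0 - accuracy
        bits += error * math.log2(error / (n_targets - 1))
    return bits


def itr_bits_per_min(n_targets: int, accuracy: float, selections_per_min: float) -> float:
    """Information transfer rate in bits/min for a given selection rate."""
    return bits_per_selection(n_targets, accuracy) * selections_per_min


if __name__ == "__main__":
    # 56 characters and 10.94 letters/min come from the summary;
    # accuracy = 1.0 is a hypothetical value chosen for illustration.
    print(f"{itr_bits_per_min(56, 1.0, 10.94):.2f} bits/min")
```

Under these assumptions the result is about 63.5 bits/min, close to the reported 63.56 bits/min. The small gap may come from averaging over participants or from a different formula, which this summary does not specify.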
Abstract
Over the past decade, the demand for communication devices has increased among individuals with mobility and speech impairments. Eye-gaze tracking has emerged as a promising solution for hands-free communication; however, traditional appearance-based interfaces often face challenges such as accuracy issues, involuntary eye movements, and difficulties with extensive command sets. This work presents a multimodal appearance-based gaze-controlled virtual keyboard that utilises deep learning in conjunction with standard camera hardware, incorporating both synchronous and asynchronous modes for command selection. The virtual keyboard application supports menu-based selection with nine commands, enabling users to spell and type up to 56 English characters, including uppercase and lowercase letters, punctuation, and a delete function for corrections. The proposed system was evaluated with…
