RingGesture: A Ring-Based Mid-Air Gesture Typing System Powered by a Deep-Learning Word Prediction Framework

Junxiao Shen; Roger Boldu; Arpit Kalla; Michael Glueck; Hemant Bhaskar Surale; Amy Karlson

arXiv:2410.18100 · cs.CV · October 25, 2024

Open Access

TL;DR

RingGesture introduces a ring-based mid-air gesture typing system for lightweight AR glasses, combining electrode and IMU sensors with a deep-learning word prediction framework to improve text entry speed and accuracy.

Contribution

This work presents a novel ring-based gesture input method and a deep-learning score fusion framework that significantly enhances AR text entry performance.

Findings

01. Achieves an average text entry speed of 27.3 WPM

02. Score Fusion reduces Character Error Rate by 28.2%

03. System usability score of 83 indicates high usability
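
For readers unfamiliar with these metrics, the sketch below shows how words per minute (WPM), character error rate (CER), and a relative reduction are conventionally computed in text-entry research. It assumes the common convention of one word per five characters and a Levenshtein-based CER; the paper's exact definitions are not given on this page, and all function names here are illustrative.

```python
def words_per_minute(transcribed: str, seconds: float) -> float:
    """Conventional text-entry WPM: one 'word' is five characters.

    The first character is excluded because timing usually starts
    when it is entered (an assumption, not taken from the paper).
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return (max(len(transcribed) - 1, 0) / 5.0) * (60.0 / seconds)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (rolling-row DP)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # deletion
                curr[j - 1] + 1,           # insertion
                prev[j - 1] + (ca != cb),  # substitution or match
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER as edit distance normalised by the reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


def relative_reduction(baseline: float, improved: float) -> float:
    """Fractional reduction of `improved` relative to `baseline`."""
    return (baseline - improved) / baseline
```

Under these conventions, a reported reduction in CER is relative to a baseline's CER, not an absolute drop in percentage points.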

Abstract

Text entry is a critical capability for any modern computing experience, and lightweight augmented reality (AR) glasses are no exception. Because these glasses are designed for all-day wearability, they cannot accommodate the multiple cameras needed for wide-field-of-view hand tracking. This constraint underscores the need for an additional input device. We propose a system to address this gap: RingGesture, a ring-based mid-air gesture typing technique that uses electrodes to mark the start and end of gesture trajectories and inertial measurement unit (IMU) sensors for hand tracking. This method offers an intuitive experience similar to the raycast-based mid-air gesture typing found in VR headsets, allowing hand movements to translate seamlessly into cursor navigation. To enhance both accuracy and input speed, we propose a novel deep-learning word…
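
The division of labour described in the abstract (electrodes delimit a gesture, the IMU drives the cursor) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Sample` fields, the fixed sampling interval `dt`, the `gain` that maps angular velocity to cursor displacement, and the simple integration scheme. It is not the authors' implementation.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class Sample:
    contact: bool      # hypothetical: True while the ring's electrode registers touch
    yaw_rate: float    # rad/s from the gyroscope (assumed signal)
    pitch_rate: float  # rad/s from the gyroscope (assumed signal)


def segment_gestures(samples: List[Sample],
                     dt: float = 0.01,
                     gain: float = 500.0) -> List[List[Point]]:
    """Integrate IMU rates into a cursor path and cut it into gestures.

    The cursor moves continuously; points are recorded only while the
    electrode reports contact, so each contact interval is one gesture.
    """
    x = y = 0.0
    gestures: List[List[Point]] = []
    current: Optional[List[Point]] = None
    for s in samples:
        # Cursor update: angular velocity scaled to screen units.
        x += gain * s.yaw_rate * dt
        y += gain * s.pitch_rate * dt
        if s.contact:
            if current is None:      # contact began: start a new trajectory
                current = []
            current.append((x, y))
        elif current is not None:    # contact ended: close the trajectory
            gestures.append(current)
            current = None
    if current is not None:          # stream ended mid-gesture
        gestures.append(current)
    return gestures
```

Each returned trajectory would then be passed to a word-gesture decoder. Using an explicit contact signal as the delimiter avoids having to infer gesture boundaries from motion alone.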

Taxonomy

Topics: Hand Gesture Recognition Systems · Speech and dialogue systems · Hearing Impairment and Communication

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings