Gesture2Text: A Generalizable Decoder for Word-Gesture Keyboards in XR Through Trajectory Coarse Discretization and Pre-training
Junxiao Shen, Khadija Khaldi, Enmin Zhou, Hemant Bhaskar Surale, and Amy Karlson

TL;DR
This paper introduces a pre-trained neural decoder for word-gesture keyboards in XR that achieves high accuracy, generalizes across different systems, and runs efficiently in real time, outperforming existing methods.
Contribution
It presents a novel pre-training approach on coarse gesture data, enabling a generalizable, accurate, and lightweight neural decoder for XR word-gesture keyboards.
Findings
Achieves 90.4% Top-4 accuracy across four datasets.
Outperforms SHARK^2 by 37.2% in accuracy.
Runs in 97 ms on Quest 3 while occupying only 4 MB.
Abstract
Text entry with word-gesture keyboards (WGK) is emerging as a popular method and becoming a key interaction for Extended Reality (XR). However, the diversity of interaction modes, keyboard sizes, and visual feedback in these environments introduces divergent word-gesture trajectory data patterns, thus leading to complexity in decoding trajectories into text. Template-matching decoding methods, such as SHARK^2, are commonly used for these WGK systems because they are easy to implement and configure. However, these methods are susceptible to decoding inaccuracies for noisy trajectories. While conventional neural-network-based decoders (neural decoders) trained on word-gesture trajectory data have been proposed to improve accuracy, they have their own limitations: they require extensive data for training and deep-learning expertise for implementation. To address these challenges, we…
Taxonomy
Topics: Hand Gesture Recognition Systems · Tactile and Sensory Interactions · Interactive and Immersive Displays
