Word-Level Motion Learning for Contactless QWERTY Typing with a Single Camera
Sung-Sic Yoo, Heung-Shik Lee

TL;DR
This paper introduces a single-camera method for contactless typing that recognizes whole words from finger motion patterns.
Contribution
The novelty lies in modeling word-level typing as spatiotemporal motion patterns using hand joint trajectories, enabling robust recognition with a single camera.
Findings
The proposed framework achieves stable word-level typing recognition using motion prototypes learned through repeated interaction.
Motion representations transfer effectively from physical keyboards to flat surfaces, even with reduced tactile and visual cues.
The method shows potential as a complement to character-based input systems in monocular sensing environments.
Abstract
Contactless text entry is increasingly important in immersive and constrained computing environments, yet most vision-based approaches rely on character-level recognition or key localization, which are fragile under monocular sensing. This study investigates the feasibility of recognizing natural QWERTY typing motions directly at the word level using only a single RGB camera, under a fixed single-user and single-camera configuration. We propose a word-level contactless typing framework that models each word as a distinctive spatiotemporal finger motion pattern derived from hand joint trajectories. Typing motions are temporally segmented, and direction-aware finger displacements are accumulated to construct compact motion representations that are relatively insensitive to absolute hand position and typing duration within the evaluated setup. Each word is represented by multiple motion…
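The representation described in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation: it assumes each finger's trajectory has already been segmented for one word and is given as a list of 2D fingertip positions, and it uses a hypothetical 8-bin direction histogram, total-displacement normalization, and nearest-prototype matching by Euclidean distance. The function names (`direction_histogram`, `motion_feature`, `classify`) and all parameters are illustrative assumptions.

```python
import math

def direction_histogram(trajectory, n_bins=8):
    """Accumulate displacement magnitude per direction bin for one fingertip.

    Assumption: `trajectory` is a list of (x, y) positions for one segmented
    word; the bin count is an illustrative choice, not taken from the paper.
    """
    hist = [0.0] * n_bins
    bin_width = 2 * math.pi / n_bins
    for (x0, y0), (x1, y1) in zip(trajectory, trajectory[1:]):
        dx, dy = x1 - x0, y1 - y0
        mag = math.hypot(dx, dy)
        if mag == 0.0:
            continue  # no movement between these frames
        angle = math.atan2(dy, dx) % (2 * math.pi)
        # Shift by half a bin so bins are centered on 0, 45, 90, ... degrees.
        b = int((angle + bin_width / 2) / bin_width) % n_bins
        hist[b] += mag
    return hist

def motion_feature(finger_trajectories, n_bins=8):
    """Concatenate per-finger histograms and normalize by total displacement.

    Only displacements are used, so the feature ignores absolute hand
    position; summing magnitudes instead of per-frame velocities makes it
    largely independent of how many frames the motion took.
    """
    feat = []
    for traj in finger_trajectories:
        feat.extend(direction_histogram(traj, n_bins))
    total = sum(feat)
    return [v / total for v in feat] if total > 0 else feat

def classify(feature, prototypes):
    """Return (word, distance) of the nearest stored prototype.

    Assumption: `prototypes` maps each word to a list of feature vectors,
    and Euclidean distance stands in for whatever matching rule is used.
    """
    best_word, best_dist = None, math.inf
    for word, protos in prototypes.items():
        for p in protos:
            d = math.dist(feature, p)
            if d < best_dist:
                best_word, best_dist = word, d
    return best_word, best_dist
```

In this sketch, a rightward stroke recorded at a different hand position or with twice as many frames produces the same feature, which is the kind of insensitivity the abstract describes for its evaluated setup. Storing several prototypes per word lets the matcher cover variations in how the same word is typed.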
Taxonomy
Topics: Interactive and Immersive Displays · Hand Gesture Recognition Systems · Gaze Tracking and Assistive Technology
