TouchInsight: Uncertainty-aware Rapid Touch and Text Input for Mixed Reality from Egocentric Vision
Paul Streli, Mark Richardson, Fadi Botros, Shugao Ma, Robert Wang, Christian Holz

TL;DR
TouchInsight is a real-time, uncertainty-aware system that detects touch input from egocentric vision, enabling accurate and natural interaction in mixed reality environments.
Contribution
This paper introduces a novel neural network-based pipeline that predicts touch events, locations, and fingers from egocentric hand tracking, accounting for sensing uncertainties.
Findings
Mean touch location error of 6.3 mm
Touch detection F1 score of 0.99
Participants typed 37 words per minute with a 2.9% error rate
Abstract
While passive surfaces offer numerous benefits for interaction in mixed reality, reliably detecting touch input solely from head-mounted cameras has been a long-standing challenge. Camera specifics, hand self-occlusion, and rapid movements of both head and fingers introduce considerable uncertainty about the exact location of touch events. Existing methods have thus not been capable of achieving the performance needed for robust interaction. In this paper, we present a real-time pipeline that detects touch input from all ten fingers on any physical surface, purely based on egocentric hand tracking. Our method TouchInsight comprises a neural network to predict the moment of a touch event, the finger making contact, and the touch location. TouchInsight represents locations through a bivariate Gaussian distribution to account for uncertainties due to sensing inaccuracies, which we resolve…
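To make the bivariate Gaussian representation concrete, the following is a minimal sketch of how a predicted touch distribution could be evaluated at a candidate surface point. It is not the authors' implementation: the function name, the (mean, standard deviation, correlation) parameterization, and the millimeter units are assumptions chosen for illustration.

```python
import math


def bivariate_gaussian_logpdf(x, y, mu_x, mu_y, sigma_x, sigma_y, rho):
    """Log-density of a bivariate Gaussian at the point (x, y).

    Hypothetical parameterization (an assumption, not taken from the paper):
    mu_x, mu_y       predicted touch location on the surface, in mm
    sigma_x, sigma_y per-axis standard deviations, in mm
    rho              correlation between the two axes, in (-1, 1)
    """
    if sigma_x <= 0 or sigma_y <= 0 or not -1.0 < rho < 1.0:
        raise ValueError("need sigma_x > 0, sigma_y > 0 and -1 < rho < 1")

    # Standardized offsets from the predicted mean.
    zx = (x - mu_x) / sigma_x
    zy = (y - mu_y) / sigma_y

    one_minus_rho2 = 1.0 - rho * rho
    # Quadratic form of the exponent.
    quad = (zx * zx - 2.0 * rho * zx * zy + zy * zy) / one_minus_rho2
    # Log of the normalization constant 2*pi*sigma_x*sigma_y*sqrt(1 - rho^2).
    log_norm = math.log(2.0 * math.pi * sigma_x * sigma_y) + 0.5 * math.log(one_minus_rho2)
    return -0.5 * quad - log_norm


# A confident prediction (small sigma) versus an uncertain one (large sigma),
# both evaluated 5 mm to the right of the predicted mean.
confident = bivariate_gaussian_logpdf(5.0, 0.0, 0.0, 0.0, 2.0, 2.0, 0.0)
uncertain = bivariate_gaussian_logpdf(5.0, 0.0, 0.0, 0.0, 8.0, 8.0, 0.0)
```

In this sketch, a wide distribution assigns more density to points far from its mean than a narrow one does, so an uncertain prediction penalizes a nearby alternative target less than a confident prediction would.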
