WatchHand: Enabling Continuous Hand Pose Tracking On Off-the-Shelf Smartwatches
Jiwan Kim, Chi-Jung Lee, Hohurn Jung, Tianhong Catherine Yu, Ruidong Zhang, Ian Oakley, Cheng Zhang

TL;DR
WatchHand introduces a novel method for continuous 3D hand pose tracking on off-the-shelf smartwatches using acoustic signals, enabling expressive interactions without extra hardware.
Contribution
WatchHand is the first system to track continuous 3D hand poses on commercial smartwatches using only the built-in speaker and microphone, with a deep-learning model that processes the acoustic signals.
Findings
Achieves a mean per-joint position error of 7.87 mm in cross-session tests.
Performs well across different devices, postures, and noise conditions.
Model adapts effectively with minimal fine-tuning for new users or gestures.
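For readers unfamiliar with the metric, mean per-joint position error (MPJPE) is conventionally the Euclidean distance between predicted and ground-truth joint positions, averaged over all joints and frames. The sketch below is a minimal illustration of that standard definition, not the authors' evaluation code; the function name `mpjpe_mm`, the array layout (frames, joints, 3), and the toy data are assumptions made for the example.

```python
import numpy as np


def mpjpe_mm(pred, truth):
    """Average Euclidean distance between predicted and true joints.

    Both inputs are assumed to have shape (frames, joints, 3), in millimetres.
    """
    pred = np.asarray(pred, dtype=float)
    truth = np.asarray(truth, dtype=float)
    if pred.shape != truth.shape or pred.shape[-1] != 3:
        raise ValueError("expected two arrays of shape (frames, joints, 3)")
    # Per-joint distance along the xyz axis, then the mean over frames and joints.
    return float(np.linalg.norm(pred - truth, axis=-1).mean())


# Toy data (hypothetical): 2 frames, 20 joints, every joint offset by (3, 4, 0) mm.
truth = np.zeros((2, 20, 3))
pred = truth.copy()
pred[..., 0] += 3.0
pred[..., 1] += 4.0
error = mpjpe_mm(pred, truth)  # every joint is exactly 5 mm off
```

Under this definition, a value of 7.87 mm means the predicted joints sit on average just under 8 mm from their reference positions.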
Abstract
Tracking hand poses on wrist-wearables enables rich, expressive interactions, yet remains unavailable on commercial smartwatches, as prior implementations rely on external sensors or custom hardware, limiting their real-world applicability. To address this, we present WatchHand, the first continuous 3D hand pose tracking system implemented on off-the-shelf smartwatches using only their built-in speaker and microphone. WatchHand emits inaudible frequency-modulated continuous waves and captures their reflections from the hand. These acoustic signals are processed by a deep-learning model that estimates 3D hand poses for 20 finger joints. We evaluate WatchHand across diverse real-world conditions -- multiple smartwatch models, wearing-hands, body postures, noise conditions, pose-variation protocols -- and achieve a mean per-joint position error of 7.87 mm in cross-session tests with device…
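To make the sensing principle concrete, the sketch below generates one frequency-modulated chirp and recovers the delay of a simulated echo by cross-correlation. It is a simplified illustration only: the abstract states that WatchHand feeds the acoustic signals to a deep-learning model rather than picking a single echo peak, and it does not give the signal parameters. The sampling rate, the 18 to 21 kHz band, the chirp length, and the echo delay are all assumed values chosen for the example.

```python
import numpy as np

FS = 48_000                   # sampling rate in Hz (assumed, not from the paper)
F0, F1 = 18_000.0, 21_000.0   # sweep band in Hz (assumed near-inaudible range)
N = 600                       # samples per chirp (assumed)
SPEED_OF_SOUND = 343.0        # m/s in air at roughly room temperature


def linear_chirp(n, fs, f0, f1):
    """One linear frequency sweep from f0 to f1 over n samples."""
    t = np.arange(n) / fs
    duration = n / fs
    phase = 2 * np.pi * (f0 * t + 0.5 * (f1 - f0) / duration * t**2)
    return np.sin(phase)


def echo_profile(received, chirp):
    """Cross-correlate the recording with the chirp; index k is a delay in samples."""
    return np.correlate(received, chirp, mode="valid")


chirp = linear_chirp(N, FS, F0, F1)

# Simulate a single attenuated reflection arriving 14 samples late (hypothetical).
true_delay = 14
received = np.zeros(N + 100)
received[true_delay:true_delay + N] += 0.3 * chirp

profile = echo_profile(received, chirp)
est_delay = int(np.argmax(np.abs(profile)))
# Halve the round-trip path to get the one-way distance to the reflector.
distance_m = est_delay / FS * SPEED_OF_SOUND / 2
```

A real hand produces many overlapping reflections that shift as the fingers move, which is one reason a learned model is a natural fit for mapping such echo profiles to joint positions.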
Taxonomy
Topics: Interactive and Immersive Displays · Hand Gesture Recognition Systems · Indoor and Outdoor Localization Technologies
