Lip-Siri: Contactless Open-Sentence Silent Speech with Wi-Fi Backscatter

Ye Tian, Haohua Du, Chao Gu, Junyang Zhang, Shanyue Wang, Hao Zhou, Jiahui Hou, and Xiang-Yang Li

arXiv:2601.18177 · cs.HC · January 27, 2026

Open Access

TL;DR

Lip-Siri is a Wi-Fi backscatter-based silent speech interface that recognizes open-vocabulary sentences by decoding lip motions, offering contactless, privacy-preserving, and energy-efficient communication.

Contribution

To the best of the authors' knowledge, this is the first work to enable open-vocabulary silent speech recognition over Wi-Fi backscatter, using lexicon-guided subword decoding.

Findings

01 Achieves 85.61% word prediction accuracy

02 Attains 36.87% word error rate in sentence recognition

03 Demonstrates reliable lip-motion extraction from Wi-Fi signals
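
For readers unfamiliar with the word error rate (WER) reported above, the following is a minimal sketch of the standard definition: the word-level edit distance (substitutions, deletions, insertions) between a reference sentence and a recognized sentence, divided by the number of reference words. The function name `word_error_rate` and the example sentences are illustrative assumptions; this page does not describe the paper's exact scoring protocol (text normalization, per-sentence versus corpus-level averaging), which may differ.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: word-level Levenshtein distance / number of reference words.

    Illustrative only; not taken from the Lip-Siri evaluation code.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substituted word out of four reference words.
example_wer = word_error_rate("turn on the light", "turn off the light")
```

Because insertions count as errors, WER can exceed 100%, so it is not simply the complement of word-level accuracy.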

Abstract

Silent speech interfaces (SSIs) enable silent interaction in noise-sensitive or privacy-sensitive settings. However, existing SSIs face practical deployment trade-offs among privacy, user experience, and energy consumption, and most remain limited to closed-set recognition over small, pre-defined vocabularies of words or sentences, which restricts real-world expressiveness. In this paper, we present Lip-Siri, to the best of our knowledge, the first Wi-Fi backscatter-based SSI that supports open-vocabulary sentence recognition via lexicon-guided subword decoding. Lip-Siri designs a frequency-shifted backscatter tag to isolate tag-modulated reflections and suppress interference from non-target motions, enabling reliable extraction of lip-motion traces from ubiquitous Wi-Fi signals. We then segment continuous traces into lip-motion units, cluster them, learn robust unit representations…
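
The frequency-shifting idea in the abstract can be illustrated with a toy baseband simulation. This is only a sketch under strong assumptions and not the authors' signal chain: the tag is modeled as an ideal complex frequency shifter (real tags often toggle a switch with a square wave, which creates sidebands and harmonics), and the sample rate, shift frequency, amplitudes, and the helper name `isolate_tag_path` are all hypothetical. The receiver mixes the shifted band down to 0 Hz and applies a moving-average low-pass filter, so unshifted reflections, including slow non-target motion, end up far from 0 Hz and are suppressed.

```python
import cmath
import math


def isolate_tag_path(samples, fs, f_shift, window):
    """Mix the band at +f_shift down to 0 Hz, then moving-average low-pass.

    Hypothetical helper for illustration. Choosing `window` as a whole number
    of periods of f_shift places a filter null on the unshifted component.
    """
    mixed = [s * cmath.exp(-2j * math.pi * f_shift * n / fs)
             for n, s in enumerate(samples)]
    out = []
    acc = 0j
    for n, m in enumerate(mixed):
        acc += m
        if n >= window:
            acc -= mixed[n - window]  # slide the averaging window forward
        if n >= window - 1:
            out.append(acc / window)
    return out


# Assumed toy parameters (not from the paper).
fs, f_shift, window = 1000.0, 100.0, 100  # Hz, Hz, samples (10 shift periods)
rx = []
for n in range(2000):
    t = n / fs
    # Strong unshifted path: direct signal plus slow non-target body motion.
    interference = 1.0 + 0.3 * math.sin(2 * math.pi * 1.0 * t)
    # Weak tag path whose amplitude follows a 2 Hz "lip-motion" envelope.
    lip_envelope = 0.2 * (1.0 + 0.5 * math.sin(2 * math.pi * 2.0 * t))
    rx.append(interference + lip_envelope * cmath.exp(2j * math.pi * f_shift * t))

# Magnitude of the recovered tag path; it tracks the weak lip envelope even
# though the unshifted interference is several times stronger.
recovered = [abs(x) for x in isolate_tag_path(rx, fs, f_shift, window)]
```

In this toy setup the moving average trades time resolution for rejection; a practical receiver would likely use a properly designed low-pass filter and operate on Wi-Fi channel measurements rather than a single tone.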

Taxonomy

Topics: Speech and Audio Processing · Face recognition and analysis · Speech Recognition and Synthesis