We Can Hear You with mmWave Radar! An End-to-End Eavesdropping System
Dachao Han, Teng Huang, Han Ding, Cui Zhao, Fei Wang, Ge Wang, Wei Xi

TL;DR
This paper introduces mmSpeech, an end-to-end mmWave radar system that eavesdrops on loudspeaker playback, even through walls, by reconstructing intelligible speech from the vibrations the playback induces, without invasive access or prior knowledge of the speaker.
Contribution
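The paper presents an end-to-end mmWave eavesdropping system that reconstructs speech from vibrations. Its components are an optimal pairing of vibrating material and radar sampling rate, a deep neural network that reconstructs speech from noisy spectrograms, and a synthetic training pipeline.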
Findings
Achieves state-of-the-art speech quality in eavesdropping.
Generalizes well across unseen speakers and conditions.
Operates effectively through walls using commercial mmWave radar.
Abstract
With the rise of voice-enabled technologies, loudspeaker playback has become widespread, posing increasing risks to speech privacy. Traditional eavesdropping methods often require invasive access or line-of-sight, limiting their practicality. In this paper, we present mmSpeech, an end-to-end mmWave-based eavesdropping system that reconstructs intelligible speech solely from vibration signals induced by loudspeaker playback, even through walls and without prior knowledge of the speaker. To achieve this, we reveal an optimal combination of vibrating material and radar sampling rate for capturing high-quality vibrations using narrowband mmWave signals. We then design a deep neural network that reconstructs intelligible speech from the estimated noisy spectrograms. To further support downstream speech understanding, we introduce a synthetic training pipeline and selectively fine-tune the…
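To make the vibration-capture step concrete, the sketch below shows one common way a radar can turn surface vibration into a displacement trace: the phase of the echo at a fixed range bin shifts by 4πx/λ when the surface moves by x. This is a generic illustration, not the method described in the paper. The 77 GHz carrier, the sampling rate, the vibration amplitude, and all function names are assumptions chosen for the example.

```python
import cmath
import math

C = 3e8  # speed of light in m/s


def simulate_slow_time(freq_hz, amp_m, fs_hz, n, carrier_hz=77e9, d0_m=1.0):
    """Hypothetical echo samples at one range bin for a sinusoidally vibrating surface.

    carrier_hz=77e9 is an assumed value, not one taken from the paper.
    """
    lam = C / carrier_hz
    samples = []
    for k in range(n):
        x = amp_m * math.sin(2 * math.pi * freq_hz * k / fs_hz)
        # Round-trip path 2*(d0 + x) gives a phase of 4*pi*(d0 + x)/lambda.
        samples.append(cmath.exp(1j * 4 * math.pi * (d0_m + x) / lam))
    return samples


def unwrap(phases):
    """Remove 2*pi jumps so consecutive phase samples differ by less than pi."""
    out = [phases[0]]
    for p in phases[1:]:
        d = p - out[-1]
        d -= 2 * math.pi * round(d / (2 * math.pi))
        out.append(out[-1] + d)
    return out


def recover_displacement(samples, carrier_hz=77e9):
    """Convert echo phase to displacement in metres, with the static offset removed."""
    lam = C / carrier_hz
    ph = unwrap([cmath.phase(s) for s in samples])
    mean = sum(ph) / len(ph)
    return [lam * (p - mean) / (4 * math.pi) for p in ph]


# Assumed scenario: a 200 Hz tone moving the surface by 10 micrometres,
# sampled at 8 kHz for 0.1 s (a whole number of vibration periods).
fs, f0, amp, n = 8000.0, 200.0, 10e-6, 800
recovered = recover_displacement(simulate_slow_time(f0, amp, fs, n))
truth = [amp * math.sin(2 * math.pi * f0 * k / fs) for k in range(n)]
max_err = max(abs(r - t) for r, t in zip(recovered, truth))
```

The radar sampling rate matters here for two reasons: it must exceed twice the highest vibration frequency of interest, and it must be high enough that the phase changes by less than π between samples, otherwise the unwrapping step fails. A time-frequency transform of the recovered trace would then give the kind of noisy spectrogram the abstract mentions as the network's input.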
Taxonomy
Topics: Speech and Audio Processing · Wireless Signal Modulation Classification · Speech Recognition and Synthesis
