An Ultra-low Power RNN Classifier for Always-On Voice Wake-Up Detection Robust to Real-World Scenarios
Emmanuel Hardy, Franck Badets

TL;DR
This paper introduces an ultra-low power RNN-based voice wake-up sensor designed to be robust to unseen real-world noise. It aims to cut the power that always-on speech applications consume in background noise by at least a factor of 100 while keeping the No Trigger Rate below 3%.
Contribution
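The authors designed a low-power RNN wake-up sensor and made it robust to unseen real-world noise, at realistic speech and noise levels, through careful design of the training dataset and the loss function. Rather than following a traditional Voice Activity Detection (VAD) approach, they trained it to mark only the start of speech.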
Findings
Less than 3% No Trigger Rate (NTR) at a duty cycle below 1% in challenging background noise
Power consumption of 45 nW for the RNN in 65 nm CMOS
Memory footprint of only 0.52 kB
Abstract
We present in this paper an ultra-low power (ULP) Recurrent Neural Network (RNN) based classifier for an always-on voice Wake-Up Sensor (WUS) with performance suitable for real-world applications. The purpose of our sensor is to bring down by at least a factor of 100 the power consumption in background noise of always-on speech processing algorithms such as Automatic Speech Recognition, Keyword Spotting, Speaker Verification, etc. Unlike the other published approaches, we designed our wake-up sensor to be robust to unseen real-world noises for realistic levels of speech and noise by carefully designing the dataset and the loss function. We also specifically trained it to mark only the speech start rather than adopting a traditional Voice Activity Detection (VAD) approach. We achieve less than 3% No Trigger Rate (NTR) for a duty cycle less than 1% in challenging background noises pooled…
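The link between a duty cycle below 1% and a power reduction of about a factor of 100 can be illustrated with a small calculation. The sketch below is not the authors' evaluation code. It assumes that NTR is the fraction of speech utterances on which the sensor never fires, and that the duty cycle is the fraction of background-noise time during which the downstream system is awake; the abstract as shown here does not define either metric. The 1 mW figure for the downstream system is a hypothetical value chosen for illustration, and the 45 nW figure covers only the RNN, not a complete sensor.

```python
from typing import Sequence


def no_trigger_rate(triggered: Sequence[bool]) -> float:
    """Fraction of speech utterances on which the sensor never fired.

    Assumed definition: one boolean per utterance, True if a trigger occurred.
    """
    if not triggered:
        raise ValueError("need at least one utterance")
    missed = sum(1 for fired in triggered if not fired)
    return missed / len(triggered)


def duty_cycle(awake_seconds: float, total_seconds: float) -> float:
    """Fraction of noise-only time the downstream system spends awake.

    Assumed definition, not taken from the paper.
    """
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    return awake_seconds / total_seconds


def power_reduction_factor(p_main_w: float, p_wus_w: float, duty: float) -> float:
    """Ratio of always-on power to wake-up-gated average power.

    Gated average power = sensor power + duty cycle * downstream power.
    """
    gated_average_w = p_wus_w + duty * p_main_w
    return p_main_w / gated_average_w


if __name__ == "__main__":
    P_MAIN_W = 1e-3   # hypothetical 1 mW downstream speech pipeline
    P_RNN_W = 45e-9   # RNN-only figure reported in the findings
    for duty in (0.01, 0.005):
        factor = power_reduction_factor(P_MAIN_W, P_RNN_W, duty)
        print(f"duty cycle {duty:.3f}: reduction factor {factor:.1f}")
```

Under these assumptions the sensor's own power is negligible next to the downstream system, so the reduction factor is close to the inverse of the duty cycle: it sits just under 100 at exactly 1% and exceeds 100 for any smaller duty cycle.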
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Advanced Adaptive Filtering Techniques
