Tiny Noise-Robust Voice Activity Detector for Voice Assistants
Hamed Jafarzadeh Asl, Mahsa Ghazvini Nejad, Amin Edraki, Masoud Asgharian, Vahid Partovi Nia

TL;DR
This paper introduces a lightweight noise-robust voice activity detector designed for on-device voice assistants, significantly improving accuracy in noisy environments without increasing model size or requiring fine-tuning.
Contribution
The paper presents a novel, lightweight VAD with added pre- and post-processing modules that enhance noise robustness without enlarging the model or needing fine-tuning.
Findings
Achieves higher accuracy in noisy environments compared to baseline models.
Maintains performance in clean speech detection.
Does not increase model complexity or require additional training.
Abstract
Voice Activity Detection (VAD) in the presence of background noise remains a challenging problem in speech processing. Accurate VAD is essential in automatic speech recognition, voice-to-text, conversational agents, etc., where noise can severely degrade performance. A modern application is the voice assistant, especially when mounted on Artificial Intelligence of Things (AIoT) devices such as cell phones, smart glasses, earbuds, etc., where the voice signal includes background noise. Therefore, VAD modules must remain lightweight due to practical on-device limitations. Existing models often struggle with low signal-to-noise ratios across diverse acoustic environments. A simple VAD often detects human voice in a clean environment but struggles to do so in noisy conditions. We propose a noise-robust VAD that comprises a light-weight VAD, with data…
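The abstract's claim that a simple VAD works in clean conditions but struggles in noise can be illustrated with a toy example. The sketch below is not the paper's method: it is a generic fixed-threshold energy detector run on a synthetic signal, and every name and parameter in it (the 200 Hz tone standing in for speech, the 10 ms frame length, the threshold value, the noise levels) is an assumption chosen for illustration only.

```python
import math
import random

SAMPLE_RATE = 16000   # Hz (assumed)
FRAME_LEN = 160       # 10 ms frames at 16 kHz (assumed)
THRESHOLD = 1e-3      # fixed energy threshold, tuned for the clean case (assumed)


def make_signal(noise_std, seed=0, n_frames=10, speech_frames=(4, 5, 6)):
    """Synthetic signal: a 200 Hz tone stands in for speech, plus Gaussian noise."""
    rng = random.Random(seed)
    signal = []
    for i in range(n_frames * FRAME_LEN):
        tone = 0.0
        if i // FRAME_LEN in speech_frames:
            tone = 0.1 * math.sin(2 * math.pi * 200 * i / SAMPLE_RATE)
        signal.append(tone + rng.gauss(0.0, noise_std))
    labels = [f in speech_frames for f in range(n_frames)]
    return signal, labels


def energy_vad(signal, frame_len=FRAME_LEN, threshold=THRESHOLD):
    """Mark a frame as speech when its mean squared amplitude exceeds the threshold."""
    decisions = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        decisions.append(energy > threshold)
    return decisions


def accuracy(pred, labels):
    """Fraction of frames whose decision matches the ground-truth label."""
    return sum(p == l for p, l in zip(pred, labels)) / len(labels)


# Nearly clean input: noise power is far below the tone power.
clean_signal, labels = make_signal(noise_std=0.001)
# Noisy input: noise power (0.01) is about twice the tone power (0.005).
noisy_signal, _ = make_signal(noise_std=0.1)

clean_acc = accuracy(energy_vad(clean_signal), labels)
noisy_acc = accuracy(energy_vad(noisy_signal), labels)
print(f"clean accuracy: {clean_acc:.2f}, noisy accuracy: {noisy_acc:.2f}")
```

With the same threshold, the detector separates tone frames from silence in the near-clean signal, but in the noisy signal the noise energy alone exceeds the threshold, so every frame is flagged as speech. Raising the threshold would not help much at this noise level, since the noise power is comparable to the speech power; this is the kind of low signal-to-noise regime the abstract describes.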
