BC-VAD: A Robust Bone Conduction Voice Activity Detection
Niccolo' Polvani, Damien Ronssin, Milos Cernak

TL;DR
This paper introduces BC-VAD, a lightweight and robust voice activity detection system using bone conduction microphone input, outperforming larger models in noisy environments while maintaining real-time processing on microcontrollers.
Contribution
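The paper presents a voice activity detector driven by a bone conduction microphone (BCM) that is efficient enough for small battery-operated devices and robust against residual non-stationary noise from the environment or from speakers not wearing the BCM.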
Findings
BC-VAD outperforms larger models on bone conduction data.
BC-VAD maintains real-time processing on microcontrollers.
The smaller BC-VAD (5k parameters) achieves better noise robustness than baseline models.
Abstract
Voice Activity Detection (VAD) is a fundamental module in many audio applications. Recent state-of-the-art VAD systems are often based on neural networks, but they require a computational budget that usually exceeds the capabilities of a small battery-operated device when preserving the performance of larger models. In this work, we rely on the input from a bone conduction microphone (BCM) to design an efficient VAD (BC-VAD) robust against residual non-stationary noises originating from the environment or speakers not wearing the BCM. We first show that a larger VAD system (58k parameters) achieves state-of-the-art results on a publicly available benchmark but fails when running on bone conduction signals. We then compare its variant BC-VAD (5k parameters and trained on BC data) with a baseline specifically designed for a BCM and show that the proposed method achieves better performances…
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Music and Audio Processing
