A Bin Encoding Training of a Spiking Neural Network-based Voice Activity Detection
Giorgia Dellaferrera, Flavio Martinelli, Milos Cernak

TL;DR
This paper introduces a bin encoding method that lets spiking neural networks perform voice activity detection with ultra-low power consumption and state-of-the-art accuracy.
Contribution
It proposes a novel bin encoding scheme for SNNs and demonstrates its effectiveness in low-power VAD applications.
Findings
Achieves VAD with only 3.8 μW power consumption.
Uses bin encoding to convert log mel filterbank bins into spike patterns.
Attains state-of-the-art performance on the QUT-NOISE-TIMIT corpus.
Abstract
Advances in deep learning for Artificial Neural Networks (ANNs) have led to significant improvements in the performance of digital signal processing systems implemented on digital chips. Although recent progress in low-power chips is remarkable, neuromorphic chips that run Spiking Neural Network (SNN) based applications offer even lower power consumption, as a consequence of the ensuing sparse spike-based coding scheme. In this work, we develop an SNN-based Voice Activity Detection (VAD) system, a building block of any audio and speech processing system. We propose bin encoding, a novel method to convert log mel filterbank bins of single time frames into spike patterns. We integrate the proposed scheme in a bilayer spiking architecture, which was evaluated on the QUT-NOISE-TIMIT corpus. Our approach shows that SNNs enable an ultra-low-power implementation…
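The abstract does not spell out how a log mel filterbank value becomes a spike pattern, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that each filterbank channel's value range is split into uniform bins, that each (channel, bin) pair corresponds to one input neuron, and that exactly one neuron per channel spikes per frame. The function name bin_encode, the parameters n_bins, lo and hi, and the uniform bin spacing are all hypothetical.

```python
import numpy as np

def bin_encode(frame, n_bins, lo, hi):
    """Illustrative one-hot bin code for a single frame of log mel energies.

    ASSUMPTION: uniform bins over [lo, hi]; one spike per channel per frame.
    Returns a (n_channels, n_bins) array of 0/1 values, where entry [c, b] = 1
    means the hypothetical input neuron for channel c, bin b emits a spike.
    """
    frame = np.asarray(frame, dtype=float)
    # n_bins uniform bins need n_bins + 1 edges; only the interior edges are
    # passed to digitize so that out-of-range values fall into the end bins.
    edges = np.linspace(lo, hi, n_bins + 1)
    idx = np.digitize(frame, edges[1:-1])  # integers in 0 .. n_bins - 1
    spikes = np.zeros((frame.size, n_bins), dtype=np.uint8)
    spikes[np.arange(frame.size), idx] = 1
    return spikes

# Example: three channels, four bins spanning [-10, 10] (made-up values).
example = bin_encode([-10.0, 0.0, 10.0], n_bins=4, lo=-10.0, hi=10.0)
```

Under these assumptions every frame produces the same number of spikes (one per channel), so the code is sparse by construction, which is consistent with the sparse spike-based coding the abstract credits for the low power consumption.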
