DPSNN: Spiking Neural Network for Low-Latency Streaming Speech Enhancement
Tao Sun, Sander Bohté

TL;DR
This paper introduces DPSNN, a low-latency spiking neural network framework for streaming speech enhancement, combining convolutional and recurrent SNNs to achieve real-time performance with high quality and energy efficiency.
Contribution
The paper proposes a novel two-phase streaming SNN architecture inspired by classical dual-path models, reducing latency to about 5ms for speech enhancement applications.
Findings
Achieves approximately 5ms latency suitable for hearing aids
Demonstrates high SNR and perceptual quality in speech enhancement
Enhances energy efficiency through regularizer-based activation suppression
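
The activation-suppression idea in the last finding can be illustrated with a generic spike-rate penalty. This is a minimal sketch under stated assumptions, not the paper's regularizer: the leaky integrate-and-fire update, the soft reset, and the names `lif_layer`, `activity_penalty`, `decay`, `threshold`, `target_rate`, and `weight` are all hypothetical choices made for illustration.

```python
def lif_layer(inputs, decay=0.9, threshold=1.0):
    """Run one leaky integrate-and-fire neuron over a sequence of input currents.

    Returns a list of binary spikes (0 or 1), one per time step.
    The decay and threshold values are illustrative assumptions.
    """
    v = 0.0  # membrane potential
    spikes = []
    for x in inputs:
        v = decay * v + x              # leaky integration of the input
        s = 1 if v >= threshold else 0
        if s:
            v -= threshold             # soft reset after a spike
        spikes.append(s)
    return spikes


def activity_penalty(spikes, target_rate=0.0, weight=1.0):
    """Quadratic penalty on the mean firing rate above target_rate.

    Adding this term to a task loss pushes the network toward fewer
    spikes, which is the usual proxy for lower energy use in SNNs.
    """
    rate = sum(spikes) / len(spikes)
    return weight * max(0.0, rate - target_rate) ** 2


spikes = lif_layer([0.6] * 5)
penalty = activity_penalty(spikes)
```

In training, a term like `penalty` would be added to the enhancement loss, so that `weight` trades output quality against spike count.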
Abstract
Speech enhancement (SE) improves communication in noisy environments, affecting areas such as automatic speech recognition, hearing aids, and telecommunications. With these domains typically being power-constrained and event-based while requiring low latency, neuromorphic algorithms in the form of spiking neural networks (SNNs) have great potential. Yet, current effective SNN solutions require a contextual sampling window imposing substantial latency, typically around 32ms, too long for many applications. Inspired by Dual-Path Recurrent Neural Networks (DPRNNs) in classical neural networks, we develop a two-phase time-domain streaming SNN framework -- the Dual-Path Spiking Neural Network (DPSNN). In the DPSNN, the first phase uses Spiking Convolutional Neural Networks (SCNNs) to capture global contextual information, while the second phase uses Spiking Recurrent Neural Networks (SRNNs) to…
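
To put the latency figures in concrete terms: a frame-based system cannot emit output until it has collected a full window, so the window length is a lower bound on its algorithmic latency. The sketch below converts a window length into a sample count. The 16 kHz sample rate is an assumption for illustration (the summary above does not state one), and `window_samples` is a hypothetical helper.

```python
def window_samples(window_ms, sample_rate_hz):
    """Number of audio samples that must be buffered for one window."""
    return round(window_ms * sample_rate_hz / 1000)


ASSUMED_SAMPLE_RATE_HZ = 16_000  # assumption, not stated in the summary

samples_32ms = window_samples(32, ASSUMED_SAMPLE_RATE_HZ)  # 512 samples
samples_5ms = window_samples(5, ASSUMED_SAMPLE_RATE_HZ)    # 80 samples
```

Under this assumption, moving from a 32ms window to roughly 5ms cuts the buffered audio from 512 samples to 80.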
Taxonomy
Topics: Speech and Audio Processing · Hearing Loss and Rehabilitation · Acoustic Wave Phenomena Research
Methods: Spiking Neural Networks · Focus
