Multimodal Speech Enhancement Using Burst Propagation
Mohsin Raza, Leandro A. Passos, Ahmed Khubaib, Ahsan Adeel

TL;DR
This paper introduces MBURST, a biologically inspired multimodal speech enhancement method that improves noise suppression and energy efficiency, making it suitable for embedded hearing aid devices.
Contribution
The paper presents MBURST, a novel biologically plausible neural network approach for audio-visual speech enhancement that significantly reduces energy consumption while maintaining performance.
Findings
MBURST achieves comparable speech mask reconstruction to baseline methods.
MBURST reduces neuron firing rates by up to 70%.
The method is energy-efficient and suitable for embedded systems.
Abstract
This paper proposes MBURST, a novel multimodal solution for audio-visual speech enhancement that considers the most recent neurological discoveries regarding pyramidal cells of the prefrontal cortex and other brain regions. The so-called burst propagation implements several criteria to address the credit assignment problem in a more biologically plausible manner: steering the sign and magnitude of plasticity through feedback, multiplexing the feedback and feedforward information across layers through different weight connections, approximating feedback and feedforward connections, and linearizing the feedback signals. MBURST benefits from such capabilities to learn correlations between the noisy signal and the visual stimuli, thus attributing meaning to the speech by amplifying relevant information and suppressing noise. Experiments conducted over a Grid Corpus and CHiME3-based…
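To make the first two criteria in the abstract concrete (feedback steering the sign and magnitude of plasticity, and feedback travelling through its own set of weights), the following is a minimal toy sketch of a burst-dependent update for a single layer. It is not the authors' implementation: the function name `burst_update`, the sigmoid nonlinearities, the fixed baseline burst probability `p_baseline`, and the learning rate `lr` are all assumptions chosen for illustration, and the linearization and weight-approximation criteria are not modeled.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def burst_update(x, feedback, W, Y, p_baseline=0.5, lr=0.1):
    """One toy burst-dependent plasticity step for a single layer (illustrative only).

    x        : presynaptic input, shape (n_in,)
    feedback : signal arriving from the layer above, shape (n_top,)
    W        : feedforward weights, shape (n_hidden, n_in)
    Y        : separate feedback weights, shape (n_hidden, n_top)
    """
    # Feedforward pathway: input drives the event rate of each unit.
    event_rate = sigmoid(W @ x)

    # Feedback pathway: a distinct weight matrix sets the burst probability.
    burst_prob = sigmoid(Y @ feedback)

    # Plasticity follows the deviation of the burst probability from its
    # baseline (assumed constant here): above baseline strengthens the
    # active synapses, below baseline weakens them.
    delta_W = lr * np.outer((burst_prob - p_baseline) * event_rate, x)

    return W + delta_W, event_rate, burst_prob


# Hypothetical usage with made-up sizes: 3 inputs, 2 hidden units, 1 feedback unit.
W = np.zeros((2, 3))
Y = np.ones((2, 1))
x = np.array([1.0, 2.0, 0.0])
W_new, event_rate, burst_prob = burst_update(x, np.array([2.0]), W, Y)
```

In this sketch, zero feedback leaves the burst probability at the baseline and the weights unchanged, while positive or negative feedback pushes the weights on active inputs up or down, which is one simple way to read "steering the sign and magnitude of plasticity through feedback".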
Taxonomy
Topics: Speech and Audio Processing · Hearing Loss and Rehabilitation · Speech Recognition and Synthesis
