Binaural Speech Enhancement Using Complex Convolutional Recurrent Networks
Vikas Tokala, Eric Grinstein, Mike Brookes, Simon Doclo, Jesper Jensen, Patrick A. Naylor

TL;DR
This paper introduces an end-to-end complex recurrent convolutional network for binaural speech enhancement, improving speech intelligibility, noise reduction, and spatial information preservation in binaural hearing devices.
Contribution
The paper proposes a novel complex convolutional recurrent network with an encoder-decoder structure and a complex LSTM, specifically designed for binaural speech enhancement.
Findings
Significant improvement in estimated speech intelligibility over baseline algorithms.
Effective noise reduction while preserving spatial cues.
Robust performance across various noise types and acoustic conditions.
Abstract
From hearing aids to augmented and virtual reality devices, binaural speech enhancement algorithms have been established as state-of-the-art techniques to improve speech intelligibility and listening comfort. In this paper, we present an end-to-end binaural speech enhancement method using a complex recurrent convolutional network with an encoder-decoder architecture and a complex LSTM recurrent block placed between the encoder and decoder. A loss function that focuses on the preservation of spatial information in addition to speech intelligibility improvement and noise reduction is introduced. The network estimates individual complex ratio masks for the left and right-ear channels of a binaural hearing device in the time-frequency domain. We show that, compared to other baseline algorithms, the proposed method significantly improves the estimated speech intelligibility and reduces the…
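The masking step described in the abstract can be illustrated with a minimal sketch: a complex ratio mask is multiplied element-wise with the noisy time-frequency representation of each ear. This is not the authors' implementation. The function names, the toy array sizes, and the constant masks standing in for network outputs are all assumptions made for illustration. The sketch also computes the interaural level difference (ILD) and interaural phase difference (IPD), two standard binaural cues, as one plausible reading of "spatial information"; the abstract as shown here does not say which cues the loss function uses.

```python
import numpy as np


def apply_complex_ratio_mask(noisy_stft, mask):
    """Element-wise complex multiplication of a mask with a noisy STFT."""
    return mask * noisy_stft


def interaural_level_difference_db(left_stft, right_stft, eps=1e-12):
    """ILD per time-frequency bin in dB (left relative to right)."""
    return 20.0 * np.log10((np.abs(left_stft) + eps) / (np.abs(right_stft) + eps))


def interaural_phase_difference(left_stft, right_stft):
    """IPD per time-frequency bin in radians."""
    return np.angle(left_stft * np.conj(right_stft))


# Toy noisy STFTs with shape (frequency bins, frames); real inputs would
# come from an STFT of the left and right microphone signals.
rng = np.random.default_rng(0)
shape = (4, 3)
noisy_left = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
noisy_right = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

# Hypothetical masks standing in for the network's two outputs.
# They are identical here only to make the cue-preservation check easy.
mask_left = 0.5 * np.exp(1j * 0.1) * np.ones(shape)
mask_right = 0.5 * np.exp(1j * 0.1) * np.ones(shape)

enhanced_left = apply_complex_ratio_mask(noisy_left, mask_left)
enhanced_right = apply_complex_ratio_mask(noisy_right, mask_right)

# Compare binaural cues before and after masking.
ild_noisy = interaural_level_difference_db(noisy_left, noisy_right)
ild_enhanced = interaural_level_difference_db(enhanced_left, enhanced_right)
ipd_noisy = interaural_phase_difference(noisy_left, noisy_right)
ipd_enhanced = interaural_phase_difference(enhanced_left, enhanced_right)

print("max ILD change (dB):", np.max(np.abs(ild_enhanced - ild_noisy)))
```

When both ears receive the same mask, as in this toy case, ILD and IPD are unchanged by construction. Separate masks per ear, as in the paper, can alter these cues, which is presumably why a loss term for spatial information is needed.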
