Binaural Speech Enhancement Using Complex Convolutional Recurrent Networks

Vikas Tokala; Eric Grinstein; Mike Brookes; Simon Doclo; Jesper Jensen; Patrick A. Naylor

arXiv:2507.20023 · eess.AS · July 29, 2025 · ACSSC

TL;DR

This paper introduces an end-to-end complex convolutional recurrent network for binaural speech enhancement. It improves estimated speech intelligibility and reduces noise while preserving the spatial information that binaural hearing devices rely on.

Contribution

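The paper proposes a complex convolutional recurrent network for binaural speech enhancement, built from an encoder-decoder structure with a complex LSTM block between the encoder and decoder. It is trained with a loss function that targets preservation of spatial information alongside speech intelligibility improvement and noise reduction.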

Findings

01 Significant improvement in estimated speech intelligibility over baseline algorithms.

02 Effective noise reduction while preserving spatial cues.

03 Robust performance across various noise types and acoustic conditions.

Abstract

From hearing aids to augmented and virtual reality devices, binaural speech enhancement algorithms have been established as state-of-the-art techniques to improve speech intelligibility and listening comfort. In this paper, we present an end-to-end binaural speech enhancement method using a complex recurrent convolutional network with an encoder-decoder architecture and a complex LSTM recurrent block placed between the encoder and decoder. A loss function that focuses on the preservation of spatial information in addition to speech intelligibility improvement and noise reduction is introduced. The network estimates individual complex ratio masks for the left and right-ear channels of a binaural hearing device in the time-frequency domain. We show that, compared to other baseline algorithms, the proposed method significantly improves the estimated speech intelligibility and reduces the…
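
The per-ear masking step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the random arrays stand in for real STFTs and network outputs, the helper names `apply_complex_ratio_mask` and `interaural_cues` are hypothetical, and the array shape is arbitrary. The interaural level difference (ILD) and interaural phase difference (IPD) used here are standard measures of binaural spatial cues; whether the paper's loss uses exactly these quantities is an assumption, not something the text above states.

```python
import numpy as np


def apply_complex_ratio_mask(stft, mask):
    """Element-wise complex multiplication of a mask with a noisy STFT."""
    return mask * stft


def interaural_cues(left, right, eps=1e-12):
    """Return ILD (dB) and IPD (radians) for each time-frequency bin."""
    ild = 20.0 * np.log10((np.abs(left) + eps) / (np.abs(right) + eps))
    ipd = np.angle(left * np.conj(right))
    return ild, ipd


rng = np.random.default_rng(0)
shape = (257, 50)  # hypothetical: frequency bins x time frames


def random_complex(shape):
    """Random complex array used as a placeholder for real data."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)


# Placeholders for the noisy left/right STFTs and the two estimated masks.
noisy_left, noisy_right = random_complex(shape), random_complex(shape)
mask_left, mask_right = random_complex(shape), random_complex(shape)

# Individual masks: each ear's STFT is scaled and rotated independently.
enhanced_left = apply_complex_ratio_mask(noisy_left, mask_left)
enhanced_right = apply_complex_ratio_mask(noisy_right, mask_right)

ild_in, ipd_in = interaural_cues(noisy_left, noisy_right)
ild_out, ipd_out = interaural_cues(enhanced_left, enhanced_right)

# Mean absolute ILD change introduced by masking the ears independently.
ild_change_db = np.mean(np.abs(ild_out - ild_in))

# For comparison: one shared mask on both ears leaves ILD and IPD unchanged.
shared_left = apply_complex_ratio_mask(noisy_left, mask_left)
shared_right = apply_complex_ratio_mask(noisy_right, mask_left)
ild_shared, ipd_shared = interaural_cues(shared_left, shared_right)
```

Because a complex mask changes both magnitude and phase, two independently estimated masks can alter the level and phase relationship between the ears. A single shared mask would leave those relationships intact. This is one way to see why a loss term aimed at spatial information is useful when the masks are estimated per ear.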
