Microphone Array Signal Processing and Deep Learning for Speech Enhancement

Reinhold Haeb-Umbach; Tomohiro Nakatani; Marc Delcroix; Christoph Boeddeker; Tsubasa Ochiai

arXiv:2501.07215·eess.AS·January 14, 2025

TL;DR

This paper compares model-based, data-driven, and hybrid methods for multi-channel speech enhancement, highlighting how combining these approaches can improve noise reduction, source separation, and dereverberation.

Contribution

It reviews hybrid approaches that combine model-based signal processing with deep learning to improve parameter estimation and filtering in speech enhancement.

Findings

01

Hybrid methods outperform purely model-based or data-driven approaches.

02

Deep learning enhances the estimation of spatial filtering parameters.

03

Hybrid approaches effectively address noise and reverberation challenges.

Abstract

Multi-channel acoustic signal processing is a well-established and powerful tool to exploit the spatial diversity between a target signal and non-target or noise sources for signal enhancement. However, the textbook solutions for optimal data-dependent spatial filtering rest on the knowledge of second-order statistical moments of the signals, which have traditionally been difficult to acquire. In this contribution, we compare model-based, purely data-driven, and hybrid approaches to parameter estimation and filtering, where the latter tries to combine the benefits of model-based signal processing and data-driven deep learning to overcome their individual deficiencies. We illustrate the underlying design principles with examples from noise reduction, source separation, and dereverberation.
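To make the role of second-order statistics concrete, here is a minimal sketch of one widely used hybrid pipeline, mask-based MVDR beamforming. It is an illustration, not code from the paper. A time-frequency mask weights the multi-channel STFT to estimate the speech and noise spatial covariance matrices, and a closed-form beamformer is then computed from them. In a hybrid system the mask would typically come from a neural network. The function names, array shapes, and the specific MVDR formulation (the reference-channel form w = Φ_n⁻¹ Φ_s u / tr(Φ_n⁻¹ Φ_s)) are assumptions made for this example.

```python
import numpy as np


def spatial_covariance(Y, mask):
    """Mask-weighted spatial covariance matrix for each frequency bin.

    Y:    complex STFT, shape (F, T, C) = (frequencies, frames, channels)
    mask: real-valued weights in [0, 1], shape (F, T)
    Returns an array of shape (F, C, C).
    """
    weighted = mask[..., None] * Y                        # (F, T, C)
    # Sum over frames of mask * y y^H
    cov = np.einsum("ftc,ftd->fcd", weighted, Y.conj())
    norm = np.maximum(mask.sum(axis=1), 1e-10)            # avoid division by zero
    return cov / norm[:, None, None]


def mvdr_weights(cov_target, cov_noise, ref=0, eps=1e-10):
    """MVDR filter per frequency, distortionless w.r.t. channel `ref`.

    Implements w = (Phi_n^{-1} Phi_s) u / trace(Phi_n^{-1} Phi_s),
    where u selects the reference channel.
    """
    C = cov_noise.shape[-1]
    # Small diagonal loading (an assumption) keeps the solve well conditioned.
    scale = np.trace(cov_noise, axis1=-2, axis2=-1).real
    loaded = cov_noise + eps * scale[:, None, None] * np.eye(C)
    numer = np.linalg.solve(loaded, cov_target)           # (F, C, C)
    trace = np.trace(numer, axis1=-2, axis2=-1)[:, None]  # (F, 1)
    return numer[:, :, ref] / (trace + eps)               # (F, C)


def apply_beamformer(w, Y):
    """Apply x_hat[f, t] = w[f]^H y[f, t]; returns shape (F, T)."""
    return np.einsum("fc,ftc->ft", w.conj(), Y)
```

Given a speech mask `m` (hypothetical, e.g. a network output), the enhanced signal would be obtained as `apply_beamformer(mvdr_weights(spatial_covariance(Y, m), spatial_covariance(Y, 1 - m)), Y)`. The split of labor is the point of the sketch: the data-driven part only supplies the mask, while the filter itself remains a model-based closed-form solution.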
