On the Role of Spatial, Spectral, and Temporal Processing for DNN-based Non-linear Multi-channel Speech Enhancement

Kristina Tesch; Nils-Hendrik Mohrmann; Timo Gerkmann

arXiv:2206.11181 · eess.AS · June 23, 2022

TL;DR

This paper investigates how deep neural networks process spatial, spectral, and temporal information for multi-channel speech enhancement, highlighting the importance of non-linear spatial filtering and joint processing for improved performance.

Contribution

The study provides experimental insights into the internal mechanisms of DNN-based non-linear filters, emphasizing the benefits of joint spatial, spectral, and temporal processing.

Findings

01 Non-linear spatial filtering outperforms linear spatial filtering by 0.24 in POLQA score.

02 Joint processing of spectral and temporal information yields an improvement of 0.4 in POLQA score.

03 Non-linear spatial filtering is crucial for effective speech enhancement.

Abstract

Employing deep neural networks (DNNs) to directly learn filters for multi-channel speech enhancement potentially has two key advantages over a traditional approach that combines a linear spatial filter with an independent tempo-spectral post-filter: 1) non-linear spatial filtering makes it possible to overcome potential restrictions originating from a linear processing model, and 2) joint processing of spatial and tempo-spectral information makes it possible to exploit interdependencies between different sources of information. A variety of DNN-based non-linear filters have been proposed recently, for which good enhancement performance is reported. However, little is known about the internal mechanisms, which turns network architecture design into a game of chance. Therefore, in this paper, we perform experiments to better understand the internal processing of spatial, spectral and temporal information by…
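
As a point of reference for the "traditional approach" the abstract contrasts against, the following is a minimal NumPy sketch of a two-stage pipeline: a linear spatial filter followed by an independent single-channel post-filter. The choice of an MVDR beamformer and a Wiener-like gain is an assumption made for illustration; the abstract does not name specific filters. All function names, array shapes, and the availability of the noise covariance and steering vector are likewise hypothetical, and this is not code from the authors' repository.

```python
import numpy as np


def mvdr_weights(noise_cov, steering):
    """MVDR weights w = R^{-1} d / (d^H R^{-1} d) for a single frequency bin.

    noise_cov: (C, C) Hermitian positive-definite noise covariance (assumed known).
    steering:  (C,) steering vector towards the target (assumed known).
    """
    rinv_d = np.linalg.solve(noise_cov, steering)
    # np.vdot conjugates its first argument, so this is d^H R^{-1} d.
    return rinv_d / np.vdot(steering, rinv_d)


def linear_spatial_filter(weights, y):
    """Apply w^H y in every time-frequency bin.

    weights: (F, C) complex, y: (C, F, T) complex multi-channel STFT.
    Returns a single-channel STFT of shape (F, T).
    """
    return np.einsum("fc,cft->ft", weights.conj(), y)


def wiener_like_postfilter(x, residual_noise_psd, floor=1e-12):
    """Independent single-channel post-filter: a real gain in [0, 1] per bin.

    x: (F, T) spatial filter output, residual_noise_psd: (F, 1) or (F, T).
    """
    power = np.maximum(np.abs(x) ** 2, floor)
    gain = np.clip(1.0 - residual_noise_psd / power, 0.0, 1.0)
    return gain * x


def two_stage_enhance(y, noise_covs, steering, residual_noise_psd):
    """Linear spatial filter followed by an independent post-filter.

    noise_covs: (F, C, C), steering: (F, C), y: (C, F, T).
    """
    weights = np.stack([mvdr_weights(r, d) for r, d in zip(noise_covs, steering)])
    beamformed = linear_spatial_filter(weights, y)
    return wiener_like_postfilter(beamformed, residual_noise_psd)
```

In this sketch the spatial stage is restricted to a weighted sum over channels, and the post-filter only sees the single-channel output. A DNN-based non-linear filter, as described in the abstract, would instead map the multi-channel input to the enhanced signal in one learned step, which removes both the linearity restriction and the separation between the two stages.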

Code & Models

Repositories

sp-uhh/deep-non-linear-filter (PyTorch)

Taxonomy

Topics: Speech and Audio Processing · Indoor and Outdoor Localization Technologies · Advanced Adaptive Filtering Techniques