Microphone Array Generalization for Multichannel Narrowband Deep Speech Enhancement

Siyuan Zhang; Xiaofei Li

arXiv:2107.12601·eess.AS·July 28, 2021

TL;DR

This paper proposes a single deep neural network for multichannel speech enhancement that generalizes across microphone array geometries and outperforms beamforming and other deep learning methods.

Contribution

A novel training approach using diverse microphone array data enables a single DNN to generalize across various array geometries for speech enhancement.

Findings

01 The networks perform well on unseen microphone arrays.

02 They outperform beamforming and other deep learning methods.

03 They maintain high performance across simulated and real room responses.

Abstract

This paper addresses the problem of microphone array generalization for deep-learning-based end-to-end multichannel speech enhancement. We aim to train a single deep neural network (DNN) that can perform well on unseen microphone arrays. When a network is trained on a fixed microphone array, the array geometry shapes the network's parameters, which restricts how well the trained network generalizes to another microphone array. To resolve this problem, a single network is trained using data recorded by various microphone arrays of different geometries. We design three variants of our recently proposed narrowband network to cope with an unknown number of microphones. Overall, the goal is to make the network learn the universal information for speech enhancement that is available for any array geometry, rather than the characteristics specific to one array. The experiments on…
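As a rough illustration of what handling an unknown number of microphones can look like, the sketch below pools per-channel STFT coefficients of a single frequency bin over the microphone axis, so the feature size no longer depends on the channel count. This is not claimed to be one of the paper's three variants: mean pooling is an assumed, deliberately crude stand-in, and the names `mean_pool_channels` and `random_bin` are hypothetical. The input is random data, not real recordings.

```python
import random


def mean_pool_channels(stft_bin):
    """Pool one frequency bin over microphones (illustrative assumption).

    stft_bin: list of M channels, each a list of T complex STFT coefficients.
    Returns T vectors [mean real, mean imag, mean magnitude]; length 3 for any M.
    """
    num_mics = len(stft_bin)
    num_frames = len(stft_bin[0])
    pooled = []
    for t in range(num_frames):
        frame = [stft_bin[m][t] for m in range(num_mics)]
        pooled.append([
            sum(x.real for x in frame) / num_mics,
            sum(x.imag for x in frame) / num_mics,
            sum(abs(x) for x in frame) / num_mics,
        ])
    return pooled


def random_bin(rng, num_mics, num_frames):
    """Stand-in for one frequency bin of a multichannel STFT (random data)."""
    return [
        [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(num_frames)]
        for _ in range(num_mics)
    ]


rng = random.Random(0)
# Arrays with different microphone counts yield identically shaped features.
shapes = {}
for num_mics in (2, 4, 8):
    feats = mean_pool_channels(random_bin(rng, num_mics, num_frames=5))
    shapes[num_mics] = (len(feats), len(feats[0]))
print(shapes)
```

Because the pooled features are averages, they are also unchanged when the microphones are reordered. Plain averaging discards much of the inter-channel information that a real enhancement network would need, so this only shows the shape-handling idea, not a usable front end.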

Code & Models

Repositories

RusselZHANG/Microphone-Array-Generalization-for-Multichannel-Narrowband-Deep-Speech-Enhancement
pytorch

Taxonomy

Topics: Speech and Audio Processing · Advanced Adaptive Filtering Techniques · Hearing Loss and Rehabilitation