Microphone Array Generalization for Multichannel Narrowband Deep Speech Enhancement
Siyuan Zhang, Xiaofei Li

TL;DR
This paper proposes a universal deep neural network for multichannel speech enhancement that generalizes across different microphone array geometries, outperforming traditional methods and previous deep learning approaches.
Contribution
A novel training approach using diverse microphone array data enables a single DNN to generalize across various array geometries for speech enhancement.
Findings
The trained networks perform well on unseen microphone arrays.
They outperform beamforming and other deep learning methods.
They maintain high performance on both simulated and real room responses.
Abstract
This paper addresses the problem of microphone array generalization for deep-learning-based end-to-end multichannel speech enhancement. We aim to train a single deep neural network (DNN) that can perform well on unseen microphone arrays. When a network is trained on a fixed microphone array, the array geometry shapes the network's parameters, which restricts how well the trained network generalizes to another microphone array. To resolve this problem, a single network is trained using data recorded by various microphone arrays of different geometries. We design three variants of our recently proposed narrowband network to cope with an arbitrary number of microphones. Overall, the goal is to make the network learn the universal information for speech enhancement that is available for any array geometry, rather than the characteristics of one particular array. The experiments on…
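The abstract does not spell out the three network variants, so the following is only a minimal sketch of one generic way a layer can accept any number of microphones. It is not the authors' architecture. It assumes that "narrowband" means each frequency bin is processed with the same weights, and it uses mean pooling over the microphone axis as a stand-in for whatever channel-handling mechanism the paper actually uses. The function names `narrowband_features` and `channel_agnostic_layer` and all shapes are hypothetical.

```python
import numpy as np


def narrowband_features(stft):
    """Turn a complex STFT of shape (mics, freqs, frames) into real features
    of shape (freqs, frames, mics, 2), holding real and imaginary parts."""
    x = np.stack([stft.real, stft.imag], axis=-1)   # (M, F, T, 2)
    return np.transpose(x, (1, 2, 0, 3))            # (F, T, M, 2)


def channel_agnostic_layer(x, w, b):
    """Apply one dense layer with weights shared across frequencies, frames
    and microphones, then average over the microphone axis.

    x: (F, T, M, 2), w: (2, H), b: (H,)  ->  output (F, T, H).
    Mean pooling is an assumption made for illustration; it makes the output
    independent of both the number and the ordering of microphones.
    """
    h = np.tanh(x @ w + b)      # (F, T, M, H), same weights for every mic
    return h.mean(axis=2)       # (F, T, H), microphone axis pooled away


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal((2, 16))
    b = np.zeros(16)
    # The same weights are applied to arrays with 2, 4 and 8 microphones.
    for n_mics in (2, 4, 8):
        shape = (n_mics, 129, 20)
        stft = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
        out = channel_agnostic_layer(narrowband_features(stft), w, b)
        print(n_mics, out.shape)
```

Because the weights never depend on the microphone count, the same parameters can be trained on a mixture of arrays with different geometries and sizes, which is the training setup the abstract describes. A real model would replace the single dense layer with a sequence model over frames and would likely keep more spatial information than a plain mean.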
Taxonomy
Topics: Speech and Audio Processing · Advanced Adaptive Filtering Techniques · Hearing Loss and Rehabilitation
