Speech enhancement with mixture-of-deep-experts with clean clustering pre-training

Shlomo E. Chazan; Jacob Goldberger; Sharon Gannot

arXiv:2102.06034·cs.SD·February 12, 2021

TL;DR

This paper introduces a mixture-of-deep-experts (MoDE) neural-network architecture for single-microphone speech enhancement. Specialized expert networks and a gating network improve robustness to unfamiliar noise types and reduce complexity at test time.

Contribution

The MoDE architecture combines multiple expert DNNs with a gating DNN to improve speech enhancement and noise robustness; clean clustering pre-training further improves performance.

Findings

01 Enhanced speech quality with MoDE architecture

02 Improved robustness to unfamiliar noise types

03 Reduced computational complexity during testing

Abstract

In this study we present a mixture of deep experts (MoDE) neural-network architecture for single-microphone speech enhancement. Our architecture comprises a set of deep neural networks (DNNs), each of which is an 'expert' in a different speech spectral pattern, such as a phoneme. A gating DNN is responsible for the latent variables, which are the weights assigned to each expert's output given a speech segment. The experts estimate a mask from the noisy input, and the final mask is then obtained as a weighted average of the experts' estimates, with the weights determined by the gating DNN. A soft spectral attenuation, based on the estimated mask, is then applied to enhance the noisy speech signal. As a byproduct, we gain a reduction in complexity at test time. We show that the experts' specialization allows better robustness to unfamiliar noise types.
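
The mask combination described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the expert and gating outputs are stand-in arrays rather than trained DNNs, the function names (`combine_expert_masks`, `soft_attenuate`) are hypothetical, and the attenuation rule with a gain `floor` is an assumed, simplified form of soft spectral attenuation.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

def combine_expert_masks(expert_masks, gate_logits):
    """Weighted average of per-expert masks.

    expert_masks: array of shape (K, F), one mask in [0, 1] per expert.
    gate_logits:  array of shape (K,), unnormalized gating scores
                  (assumed here to be normalized by a softmax).
    Returns the final mask of shape (F,) and the gating weights.
    """
    weights = softmax(gate_logits)
    final_mask = weights @ expert_masks
    return final_mask, weights

def soft_attenuate(noisy_mag, mask, floor=0.1):
    """Assumed attenuation rule: the gain moves from `floor` (mask = 0)
    to 1 (mask = 1), so noise-dominated bins are attenuated, not zeroed."""
    gain = floor + (1.0 - floor) * mask
    return gain * noisy_mag

# Toy example: two experts, three frequency bins, equal gating scores.
expert_masks = np.array([[1.0, 0.0, 0.5],
                         [0.0, 1.0, 0.5]])
gate_logits = np.array([0.0, 0.0])
noisy_mag = np.array([2.0, 2.0, 2.0])

final_mask, weights = combine_expert_masks(expert_masks, gate_logits)
enhanced = soft_attenuate(noisy_mag, final_mask)
```

With equal gating scores, each expert contributes half of the final mask, so every bin of the toy mask is 0.5 and the noisy magnitude is scaled by a gain of 0.55.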
