A Hybrid Approach for Speech Enhancement Using MoG Model and Neural Network Phoneme Classifier

Shlomo E. Chazan; Jacob Goldberger; Sharon Gannot

arXiv:1510.07315·cs.SD·October 27, 2015

TL;DR

This paper introduces a hybrid speech enhancement method combining a generative MoG model with a discriminative neural network to improve speech quality and recognition accuracy in single-microphone scenarios.

Contribution

It presents a novel two-phase hybrid approach integrating MoG and neural network models for more effective speech enhancement.

Findings

1. Significant improvement in speech quality measures.
2. Enhanced speech recognition accuracy.
3. Effective noise suppression in real-world conditions.

Abstract

In this paper we present a single-microphone speech enhancement algorithm. A hybrid approach is proposed that merges the generative mixture of Gaussians (MoG) model with the discriminative neural network (NN). The proposed algorithm is executed in two phases: the training phase, which does not recur, and the test phase. First, the noise-free speech power spectral density (PSD) is modeled as a MoG, representing the phoneme-based diversity in the speech signal. An NN is then trained on a phoneme-labeled database for phoneme classification, with mel-frequency cepstral coefficients (MFCC) as the input features. Given the phoneme classification results, a speech presence probability (SPP) is obtained using both the generative and discriminative models. Soft spectral subtraction is then executed while the noise estimate is simultaneously updated. The discriminative NN maintains the continuity of…
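
The test-phase steps named in the abstract (per-phoneme Gaussians, NN phoneme posteriors, SPP, soft spectral subtraction, noise update) can be illustrated with a minimal sketch. The abstract gives no equations, so everything below is an assumption made for illustration: the noisy log-spectrum is treated as roughly the maximum of the speech and noise log-spectra, each phoneme is a single diagonal Gaussian, the noise is one Gaussian per frequency bin, and the attenuation `beta` and smoothing factor `alpha` are arbitrary. All function and variable names are hypothetical, and the phoneme posteriors that would come from the NN are hard-coded.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)  # element-wise erf without requiring SciPy


def normal_pdf(z, mean, var):
    return np.exp(-0.5 * (z - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)


def normal_cdf(z, mean, var):
    return 0.5 * (1.0 + _erf((z - mean) / np.sqrt(2.0 * var)))


def speech_presence_probability(z, phoneme_post, speech_mean, speech_var,
                                noise_mean, noise_var):
    """Per-bin SPP under the assumed max model z = max(speech, noise).

    z: (F,) noisy log-spectrum; phoneme_post: (K,) classifier posteriors;
    speech_mean, speech_var: (K, F); noise_mean, noise_var: (F,).
    """
    # Likelihood that speech equals z while noise lies below it, per phoneme.
    speech_dominant = (normal_pdf(z, speech_mean, speech_var)
                       * normal_cdf(z, noise_mean, noise_var))
    # Likelihood that noise equals z while speech lies below it.
    noise_dominant = (normal_pdf(z, noise_mean, noise_var)
                      * normal_cdf(z, speech_mean, speech_var))
    rho_k = speech_dominant / (speech_dominant + noise_dominant + 1e-12)
    # Weight the per-phoneme (generative) SPP by the discriminative posteriors.
    return phoneme_post @ rho_k


def soft_spectral_subtraction(z, rho, beta=10.0):
    # Keep bins where speech is likely; attenuate the rest by up to beta
    # (log-domain units, assumed value).
    return z - (1.0 - rho) * beta


def update_noise_mean(noise_mean, z, rho, alpha=0.1):
    # Move the noise estimate toward z only where speech is unlikely.
    return noise_mean + alpha * (1.0 - rho) * (z - noise_mean)


# Toy frame: K = 2 phonemes, F = 2 frequency bins (made-up numbers).
speech_mean = np.array([[5.0, -5.0], [4.0, -4.0]])
speech_var = np.ones((2, 2))
noise_mean = np.zeros(2)
noise_var = np.ones(2)
phoneme_post = np.array([0.9, 0.1])   # stand-in for the NN output
z = np.array([5.0, 0.0])              # bin 0 looks like speech, bin 1 like noise

rho = speech_presence_probability(z, phoneme_post, speech_mean, speech_var,
                                  noise_mean, noise_var)
x_hat = soft_spectral_subtraction(z, rho)
noise_mean = update_noise_mean(noise_mean, z, rho)
```

In this toy frame the first bin sits near the speech means and far above the noise, so its SPP is close to one and the bin is left almost untouched. The second bin matches the noise model, so its SPP is close to zero and it receives nearly the full attenuation.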

Taxonomy

Topics: Speech and Audio Processing · Advanced Adaptive Filtering Techniques · Music and Audio Processing