Concatenated Identical DNN (CI-DNN) to Reduce Noise-Type Dependence in DNN-Based Speech Enhancement

Ziyi Xu; Maximilian Strake; Tim Fingscheidt

arXiv:1810.11217 · eess.AS · October 29, 2018 · EUSIPCO · 1 cites

TL;DR

This paper introduces a concatenated identical DNN (CI-DNN) framework for mask-based speech enhancement. It outperforms classical spectral weighting rules in total speech quality and speech intelligibility, and it generalizes well to unseen noise types.

Contribution

The paper proposes CI-DNN: a single DNN is trained under multiple input and output SNR conditions and then concatenated several times without retraining, improving noise reduction and speech quality with fewer trainable parameters and better generalization.

Findings

01 Outperforms classical spectral weighting in speech quality and intelligibility

02 Achieves similar or better performance with fewer trainable parameters

03 Generalizes better to unseen noise types compared to other deep learning approaches

Abstract

Estimating time-frequency domain masks for speech enhancement using deep learning approaches has recently become a popular field of research. In this paper, we propose a mask-based speech enhancement framework by using concatenated identical deep neural networks (CI-DNNs). The idea is that a single DNN is trained under multiple input and output signal-to-noise power ratio (SNR) conditions, using targets that provide a moderate SNR gain with respect to the input and therefore achieve a balance between speech component quality and noise suppression. We concatenate this single DNN several times without any retraining to provide enough noise attenuation. Simulation results show that our proposed CI-DNN outperforms enhancement methods using classical spectral weighting rules w.r.t. total speech quality and speech intelligibility. Moreover, our approach shows similar or even a little bit…
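The concatenation idea in the abstract can be illustrated with a minimal sketch. It assumes that each stage estimates a time-frequency mask from its own input and multiplies that input by the mask; the abstract does not spell out this detail. The function `toy_mask_estimator` is a hypothetical Wiener-like stand-in, not the trained DNN from the paper, and all names and parameters here are illustrative.

```python
import numpy as np

def toy_mask_estimator(magnitude, noise_floor=1.0):
    """Hypothetical stand-in for the trained DNN: a Wiener-like gain in [0, 1]."""
    power = magnitude ** 2
    return power / (power + noise_floor ** 2)

def apply_concatenated(magnitude, mask_estimator, num_stages):
    """Apply the same mask estimator num_stages times in series (no retraining)."""
    enhanced = magnitude
    for _ in range(num_stages):
        # Assumption: each stage masks its own input spectrum.
        mask = mask_estimator(enhanced)
        enhanced = mask * enhanced
    return enhanced

# Toy magnitude spectrogram (frequency bins x frames), random data for illustration only.
rng = np.random.default_rng(0)
noisy = np.abs(rng.normal(size=(257, 10)))

one_stage = apply_concatenated(noisy, toy_mask_estimator, 1)
three_stages = apply_concatenated(noisy, toy_mask_estimator, 3)
```

Because every stage applies a gain between 0 and 1, each additional stage can only attenuate further. This mirrors the abstract's motivation: a moderate gain per stage, repeated until the noise attenuation is sufficient.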

Taxonomy

Topics: Speech and Audio Processing · Hearing Loss and Rehabilitation · Advanced Adaptive Filtering Techniques