Simultaneous Denoising and Dereverberation Using Deep Embedding Features

Cunhang Fan; Jianhua Tao; Bin Liu; Jiangyan Yi; Zhengqi Wen

arXiv:2004.02420 · eess.AS · April 7, 2020 · 1 citation

Open Access

TL;DR

This paper introduces a joint deep learning approach for simultaneous speech denoising and dereverberation, leveraging deep embedding features and a two-stage process to improve speech quality in noisy, reverberant environments.

Contribution

The paper proposes a joint training method that uses deep embedding features for combined denoising and dereverberation, outperforming the WPE and BLSTM baselines, especially at low SNRs.

Findings

01 Outperforms the WPE and BLSTM baselines in low-SNR conditions

02 Uses deep embedding features from a deep clustering (DC) network, a method originally developed for speech separation

03 Employs a two-stage neural network approach: denoising first, then dereverberation

Abstract

Monaural speech dereverberation is a very challenging task because no spatial cues can be used. When additive noise is also present, the task becomes even more challenging. In this paper, we propose a joint training method for simultaneous speech denoising and dereverberation using deep embedding features, based on deep clustering (DC). DC is a state-of-the-art method for speech separation that comprises embedding learning and K-means clustering. Our proposed method contains two stages: denoising and dereverberation. At the denoising stage, the DC network is leveraged to extract noise-free deep embedding features. These embedding features are generated from the anechoic speech and residual reverberation signals. They represent the inferred spectral masking patterns of the desired signals and are therefore discriminative features. At the dereverberation stage, instead of using…
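The abstract describes DC as embedding learning followed by K-means clustering. The sketch below illustrates that generic recipe in NumPy; it is not the authors' network or training code. It assumes the commonly used deep clustering objective, the squared Frobenius distance between the embedding affinity matrix and the label affinity matrix, and a plain K-means loop that turns embeddings into binary time-frequency masks. The function names `dc_affinity_loss` and `kmeans_masks`, the array shapes, and all parameter defaults are hypothetical choices made for illustration.

```python
import numpy as np


def dc_affinity_loss(V, Y):
    """Assumed DC objective: ||V V^T - Y Y^T||_F^2.

    V: (N, D) embeddings, one row per time-frequency bin.
    Y: (N, C) one-hot labels marking the dominant source in each bin.
    The low-rank expansion below avoids building the (N, N) affinity matrices.
    """
    VtV = V.T @ V  # (D, D)
    VtY = V.T @ Y  # (D, C)
    YtY = Y.T @ Y  # (C, C)
    return (VtV ** 2).sum() - 2.0 * (VtY ** 2).sum() + (YtY ** 2).sum()


def kmeans_masks(V, n_clusters=2, n_iters=20, seed=0):
    """Cluster embeddings with plain K-means and return one-hot masks (N, C)."""
    rng = np.random.default_rng(seed)
    # Initialise the centres from randomly chosen embedding rows (a copy of V).
    centers = V[rng.choice(len(V), size=n_clusters, replace=False)]
    for _ in range(n_iters):
        # Squared Euclidean distance from every embedding to every centre.
        dists = ((V[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dists.argmin(axis=1)
        for k in range(n_clusters):
            members = V[labels == k]
            if len(members):  # leave an empty cluster's centre unchanged
                centers[k] = members.mean(axis=0)
    # Each row has a single 1 in the column of its assigned cluster.
    return np.eye(n_clusters)[labels]
```

In a typical DC setup, `V` would come from a trained network applied to the mixture spectrogram, and each column of the returned mask would be reshaped to the spectrogram's time-frequency grid. The paper's own approach, per the abstract, uses the embeddings as features for a later dereverberation stage; that stage is not sketched here because the abstract is truncated before describing it.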

Taxonomy

Topics: Speech and Audio Processing · Music and Audio Processing · Speech Recognition and Synthesis

Methods: k-Means Clustering