Components Loss for Neural Networks in Mask-Based Speech Enhancement

Ziyi Xu; Samy Elshamy; Ziyue Zhao; Tim Fingscheidt

arXiv:1908.05087 · eess.AS · August 15, 2019 · 6 cites

TL;DR

This paper introduces a novel components loss function for neural network training in mask-based speech enhancement, improving speech quality and residual noise naturalness over traditional loss functions.

Contribution

The paper proposes a new components loss (CL) for training mask-estimation networks. The loss gives separate control over three aspects of the enhanced signal: preservation of the speech component, suppression of the residual noise, and naturalness of the residual noise.

Findings

01 Better PESQ scores and SNR improvements for seen noise types.

02 More natural residual noise and improved perceptual speech quality.

03 Enhanced performance on unseen noise types.

Abstract

Estimating time-frequency domain masks for single-channel speech enhancement using deep learning methods has recently become a popular research field with promising results. In this paper, we propose a novel components loss (CL) for the training of neural networks for mask-based speech enhancement. During the training process, the proposed CL offers separate control over preservation of the speech component quality, suppression of the residual noise component, and preservation of a naturally sounding residual noise component. We illustrate the potential of the proposed CL by evaluating a standard convolutional neural network (CNN) for mask-based speech enhancement. The new CL obtains a better and more balanced performance in almost all employed instrumental quality metrics over the baseline losses, the latter comprising the conventional mean squared error (MSE) loss and also…
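The abstract names three goals but the excerpt here does not give the loss formula. The following is only a minimal sketch of the idea, assuming the estimated mask is applied separately to the clean-speech and noise magnitude spectra and that the three terms are combined as a weighted sum. The function name `components_loss`, the weights `alpha` and `beta`, and the energy-normalized form of the naturalness term are all illustrative assumptions, not the authors' definitions.

```python
import numpy as np


def components_loss(mask, speech_mag, noise_mag, alpha=0.5, beta=0.1, eps=1e-8):
    """Illustrative three-term loss (assumed form, not the paper's formula).

    mask, speech_mag, noise_mag: arrays of identical shape (frames x bins).
    alpha, beta: hypothetical weights with alpha + beta <= 1.
    """
    # Apply the same mask to each component separately.
    filtered_speech = mask * speech_mag
    filtered_noise = mask * noise_mag

    # Speech preservation: the masked speech should stay close to clean speech.
    speech_term = np.mean((filtered_speech - speech_mag) ** 2)

    # Noise suppression: the masked noise should carry little power.
    noise_term = np.mean(filtered_noise ** 2)

    # Residual noise naturalness (assumed form): after normalizing both to
    # unit energy, the residual noise should keep the shape of the input noise.
    def unit_energy(x):
        return x / np.sqrt(np.sum(x ** 2) + eps)

    natural_term = np.sum((unit_energy(filtered_noise) - unit_energy(noise_mag)) ** 2)

    return (1.0 - alpha - beta) * speech_term + alpha * noise_term + beta * natural_term


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    speech = rng.random((4, 8)) + 0.1  # toy magnitude spectra
    noise = rng.random((4, 8)) + 0.1
    mask = rng.random((4, 8))          # toy mask values in [0, 1)
    print(components_loss(mask, speech, noise))
```

In this sketch an all-ones mask leaves the speech and the noise shape untouched, so only the suppression term contributes. A constant attenuation such as 0.5 also keeps the naturalness term near zero, because uniform scaling does not change the shape of the noise. Raising `alpha` favors stronger noise suppression at the cost of speech distortion.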

Code & Models

Repositories

ifnspaml/Components-Loss · tf · Official

Taxonomy

Topics: Speech and Audio Processing · Hearing Loss and Rehabilitation · Advanced Adaptive Filtering Techniques