A Flow-Based Neural Network for Time Domain Speech Enhancement

Martin Strauss; Bernd Edler

arXiv:2106.09008 · eess.AS · June 17, 2021

TL;DR

This paper introduces a flow-based neural network for time domain speech enhancement. It adapts the WaveGlow model from speech synthesis to enhance noisy utterances directly, and reports results comparable to state-of-the-art GAN-based methods on a publicly available dataset.

Contribution

The paper presents a novel normalizing flow framework for speech enhancement. It estimates the density of clean speech utterances conditioned on their noisy counterparts, using an adapted WaveGlow architecture and nonlinear input companding.
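
As a rough illustration of what "density estimation of clean speech conditioned on noisy input" means in a normalizing flow, the standard change-of-variables objective can be written as below. The symbols are assumptions introduced here, not taken from the paper: x is a clean utterance, y its noisy counterpart, f_θ an invertible network conditioned on y, and p_Z a simple latent prior (a zero-mean Gaussian is the usual choice).

```latex
\begin{align}
\log p_X(x \mid y) &= \log p_Z\big(f_\theta(x; y)\big)
  + \log \left| \det \frac{\partial f_\theta(x; y)}{\partial x} \right|, \\
\hat{x} &= f_\theta^{-1}(z; y), \qquad z \sim p_Z .
\end{align}
```

Training would maximize the first line over pairs (x, y). Enhancement would then follow the second line: draw a latent sample z and invert the flow given the noisy utterance y.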

Findings

01

Achieves comparable results to GAN-based methods

02

Surpasses baseline models on objective metrics

03

Demonstrates effectiveness of nonlinear input companding
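
The companding finding can be made concrete with a small sketch. The summary above does not say which companding function is used, so the code assumes mu-law companding (with mu = 255 and no quantization) as one common nonlinear compander. The function names are hypothetical.

```python
import math

def mu_law_compress(x, mu=255.0):
    """Map a sample in [-1, 1] to [-1, 1], stretching small amplitudes.

    Assumption: mu-law is used only as a representative compander.
    """
    x = max(-1.0, min(1.0, x))  # clip to the valid input range
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mu_law_expand(y, mu=255.0):
    """Inverse of mu_law_compress: recover the original amplitude."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)

if __name__ == "__main__":
    for sample in (0.001, 0.01, 0.1, 1.0):
        compressed = mu_law_compress(sample)
        restored = mu_law_expand(compressed)
        print(f"{sample:>6} -> {compressed:.4f} -> {restored:.6f}")
```

Speech samples cluster near zero, and a mapping like this spreads them over more of the [-1, 1] range (an input of 0.01 maps to roughly 0.23). This is one way to read "equalizing the distribution of input samples". Because the mapping is invertible, the enhanced output can be expanded back to the original amplitude scale.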

Abstract

Speech enhancement involves the distinction of a target speech signal from an intrusive background. Although generative approaches using Variational Autoencoders or Generative Adversarial Networks (GANs) have increasingly been used in recent years, normalizing flow (NF) based systems are still scarce, despite their success in related fields. Thus, in this paper we propose an NF framework to directly model the enhancement process by density estimation of clean speech utterances conditioned on their noisy counterpart. The WaveGlow model from speech synthesis is adapted to enable direct enhancement of noisy utterances in the time domain. In addition, we demonstrate that nonlinear input companding benefits the model performance by equalizing the distribution of input samples. Experimental evaluation on a publicly available dataset shows comparable results to current state-of-the-art GAN-based…
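
To show how a flow step can be both invertible and conditioned on the noisy signal, here is a minimal sketch of one affine coupling step in plain Python. It is not the paper's architecture. `toy_conditioner` is a hypothetical stand-in for the learned network that WaveGlow-style models use to predict scale and shift, and the conditioning vector `cond` stands in for features of the noisy utterance.

```python
import math

def toy_conditioner(x_a, cond):
    """Hypothetical stand-in for a learned network.

    Returns a per-sample log-scale and shift computed from the untouched
    half x_a and the conditioning signal cond (same length as x_a).
    """
    log_s = [0.5 * math.tanh(a + c) for a, c in zip(x_a, cond)]
    t = [0.1 * a - 0.2 * c for a, c in zip(x_a, cond)]
    return log_s, t

def coupling_forward(x, cond):
    """Clean-speech side to latent side; also returns log|det Jacobian|."""
    half = len(x) // 2
    x_a, x_b = x[:half], x[half:]
    log_s, t = toy_conditioner(x_a, cond)
    z_b = [math.exp(ls) * b + ti for ls, b, ti in zip(log_s, x_b, t)]
    # The Jacobian is triangular, so its log-determinant is the sum of log-scales.
    return x_a + z_b, sum(log_s)

def coupling_inverse(z, cond):
    """Latent side back to signal side, given the same conditioning."""
    half = len(z) // 2
    z_a, z_b = z[:half], z[half:]
    log_s, t = toy_conditioner(z_a, cond)  # z_a equals x_a, so this matches forward
    x_b = [(b - ti) * math.exp(-ls) for ls, b, ti in zip(log_s, z_b, t)]
    return z_a + x_b

if __name__ == "__main__":
    x = [0.3, -0.2, 0.5, 0.1]   # toy "clean" samples (even length)
    cond = [0.05, -0.4]         # toy "noisy" conditioning, half the length
    z, log_det = coupling_forward(x, cond)
    print(coupling_inverse(z, cond), log_det)
```

The first half of the input passes through unchanged, which is what lets the inverse recompute the same scale and shift. A full model would stack many such steps, interleaved with invertible 1x1 convolutions that mix the channels.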

Taxonomy

Methods: Invertible 1x1 Convolution · Affine Coupling · Normalizing Flows · WaveGlow