Unsupervised Speech Enhancement using Dynamical Variational Auto-Encoders

Xiaoyu Bie; Simon Leglaive; Xavier Alameda-Pineda; Laurent Girin

arXiv:2106.12271·cs.SD·October 4, 2022

TL;DR

This paper introduces an unsupervised speech enhancement method based on dynamical variational autoencoders (DVAEs). The method models temporal dependencies in speech and outperforms VAE-based and baseline methods, especially on unseen noise types.

Contribution

It extends VAE-based speech enhancement to DVAEs, combining speech dynamics modeling with unsupervised learning for improved noise robustness.

Findings

01. DVAE-based method outperforms VAE-based and baseline methods

02. Effective on unseen noise types

03. Versatile framework with three DVAE models

Abstract

Dynamical variational autoencoders (DVAEs) are a class of deep generative models with latent variables, dedicated to modeling time series of high-dimensional data. DVAEs can be considered extensions of the variational autoencoder (VAE) that include temporal dependencies between successive observed and/or latent vectors. Previous work has shown the benefit of using DVAEs over the VAE for modeling speech spectrograms. Independently, the VAE has been successfully applied to speech enhancement in noise, in an unsupervised noise-agnostic set-up that requires neither noise samples nor noisy speech samples at training time, only clean speech signals. In this paper, we extend these works to DVAE-based single-channel unsupervised speech enhancement, hence exploiting both unsupervised representation learning of speech signals and dynamics modeling. We propose an unsupervised speech…
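
To make the phrase "temporal dependencies between successive observed and/or latent vectors" concrete, the sketch below contrasts the two generative factorizations. The notation is an assumption for illustration and does not appear on this page: $x_t$ is the observed vector at time frame $t$ (for example, one spectrogram frame), $z_t$ is the corresponding latent vector, and $T$ is the sequence length. The DVAE line shows the most general causal form; individual DVAE models typically keep only some of these dependencies.

```latex
% VAE: each frame is generated independently of all other frames
p(x_{1:T}, z_{1:T}) = \prod_{t=1}^{T} p(x_t \mid z_t)\, p(z_t)

% DVAE (general causal form): each frame may depend on past observations and latents
p(x_{1:T}, z_{1:T}) = \prod_{t=1}^{T} p(x_t \mid x_{1:t-1}, z_{1:t})\, p(z_t \mid x_{1:t-1}, z_{1:t-1})
```

Dropping every dependency on past frames in the second expression recovers the first, which is the sense in which a DVAE extends the VAE.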

Code & Models

Repositories

xiaoyubie1994/dvae_se (PyTorch)
