# Speech enhancement with variational autoencoders and alpha-stable distributions

**Authors:** Simon Leglaive, Umut Simsekli, Antoine Liutkus, Laurent Girin, Radu Horaud

arXiv: 1902.03926 · 2019-05-01

## TL;DR

This paper introduces a single-channel, semi-supervised speech enhancement method that combines a variational autoencoder speech model with an alpha-stable noise model, improving the perceptual quality and intelligibility of the enhanced speech without assuming prior knowledge of the noisy recording environment.

## Contribution

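It proposes a noise model based on alpha-stable distributions within a variational autoencoder framework for speech enhancement, replacing the more conventional Gaussian non-negative matrix factorization (NMF) noise model used in previous studies. The model parameters are estimated at test time with a Monte Carlo expectation-maximization algorithm.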
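To give a feel for why an alpha-stable noise model differs from a Gaussian one, the sketch below draws real-valued symmetric alpha-stable samples with the Chambers-Mallows-Stuck method and compares how often large values occur. It is an illustration only, not the paper's noise model or inference code: the function names `sample_sas` and `tail_fraction`, the threshold, and the sample count are assumptions chosen for the example. With `alpha = 2` the sampler reduces to a Gaussian with variance `2 * scale**2`; smaller `alpha` gives heavier tails.

```python
import math
import random


def sample_sas(alpha, scale=1.0, rng=random):
    """Draw one symmetric alpha-stable sample (Chambers-Mallows-Stuck method).

    Hypothetical helper for illustration; alpha must lie in (0, 2].
    """
    if not 0.0 < alpha <= 2.0:
        raise ValueError("alpha must lie in (0, 2]")
    u = rng.uniform(-math.pi / 2, math.pi / 2)  # uniform angle
    w = rng.expovariate(1.0)                    # unit-rate exponential
    if alpha == 1.0:
        return scale * math.tan(u)              # Cauchy special case
    x = (
        math.sin(alpha * u) / math.cos(u) ** (1.0 / alpha)
        * (math.cos(u - alpha * u) / w) ** ((1.0 - alpha) / alpha)
    )
    return scale * x


def tail_fraction(alpha, threshold=6.0, n=20000, seed=0):
    """Fraction of n samples whose magnitude exceeds the threshold."""
    rng = random.Random(seed)
    hits = sum(abs(sample_sas(alpha, rng=rng)) > threshold for _ in range(n))
    return hits / n


if __name__ == "__main__":
    for alpha in (2.0, 1.8, 1.5):
        print(f"alpha={alpha}: P(|x| > 6) ~ {tail_fraction(alpha):.4f}")
```

Running the script prints an empirical tail probability for each `alpha`; the heavier-tailed settings should produce large-magnitude samples far more often than the Gaussian case, which is the property that makes such models tolerant of impulsive noise.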

## Key findings

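- Outperforms Gaussian NMF-based noise models in the perceptual quality of the enhanced speech
- Improves the intelligibility of the enhanced speech signal
- Works in a semi-supervised setting: the speech model is learned in advance, while the noise model remains unsupervised and needs no prior knowledge of the recording environment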

## Abstract

This paper focuses on single-channel semi-supervised speech enhancement. We learn a speaker-independent deep generative speech model using the framework of variational autoencoders. The noise model remains unsupervised because we do not assume prior knowledge of the noisy recording environment. In this context, our contribution is to propose a noise model based on alpha-stable distributions, instead of the more conventional Gaussian non-negative matrix factorization approach found in previous studies. We develop a Monte Carlo expectation-maximization algorithm for estimating the model parameters at test time. Experimental results show the superiority of the proposed approach both in terms of perceptual quality and intelligibility of the enhanced speech signal.
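As a reminder of what a Monte Carlo expectation-maximization iteration looks like in general, the textbook form is written out below. The symbols are assumptions introduced for this note (observations $x$, latent variables $z$, parameters $\theta$, and $R$ samples); this is the generic scheme, not the paper's specific derivation.

```latex
% E-step: the intractable expectation is replaced by a sample average
Q(\theta; \theta^{\text{old}})
  = \mathbb{E}_{p(z \mid x; \theta^{\text{old}})}\!\left[ \ln p(x, z; \theta) \right]
  \approx \frac{1}{R} \sum_{r=1}^{R} \ln p\!\left(x, z^{(r)}; \theta\right),
  \qquad z^{(r)} \sim p\!\left(z \mid x; \theta^{\text{old}}\right)

% M-step: the parameters are updated by maximizing the approximation
\theta^{\text{new}} = \arg\max_{\theta} \; Q(\theta; \theta^{\text{old}})
```

In practice the samples $z^{(r)}$ are usually drawn with a Markov chain Monte Carlo method when the posterior cannot be sampled directly.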

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1902.03926/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/1902.03926/full.md

## References

38 references — full list in the complete paper: https://tomesphere.com/paper/1902.03926/full.md

---
Source: https://tomesphere.com/paper/1902.03926