# A Statistically Principled and Computationally Efficient Approach to Speech Enhancement using Variational Autoencoders

**Authors:** Manuel Pariente (MULTISPEECH), Antoine Deleforge (MULTISPEECH), Emmanuel Vincent (MULTISPEECH)

arXiv: 1905.01209 · 2019-05-15

## TL;DR

This paper introduces a variational inference approach to VAE-based speech enhancement that performs on par with sampling-based methods while reducing the computational cost of reaching a given performance by a factor of 36.

## Contribution

It provides an analytical derivation of the variational steps, in which the encoder of the pre-learned VAE estimates the variational approximation of the true posterior distribution, making VAE-based speech enhancement substantially cheaper to compute.
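For orientation, the bound that standard VAE training maximizes can be written as below. This is the generic evidence lower bound with a diagonal Gaussian encoder, not the paper's specific derivation. The notation is assumed here rather than taken from the paper: $\mathbf{x}$ is an observed frame, $\mathbf{z}$ the latent variable, $\theta$ the decoder parameters, and $\phi$ the encoder parameters.

```latex
\log p_\theta(\mathbf{x})
  \;\ge\;
  \mathbb{E}_{q_\phi(\mathbf{z}\mid\mathbf{x})}\!\left[\log p_\theta(\mathbf{x}\mid\mathbf{z})\right]
  - D_{\mathrm{KL}}\!\left(q_\phi(\mathbf{z}\mid\mathbf{x}) \,\|\, p(\mathbf{z})\right),
\qquad
q_\phi(\mathbf{z}\mid\mathbf{x})
  = \mathcal{N}\!\left(\mathbf{z};\, \boldsymbol{\mu}_\phi(\mathbf{x}),\,
    \operatorname{diag}\!\left(\boldsymbol{\sigma}^2_\phi(\mathbf{x})\right)\right)
```

The encoder outputs $\boldsymbol{\mu}_\phi$ and $\boldsymbol{\sigma}^2_\phi$ in a single forward pass, which is presumably why reusing it at inference time avoids the cost of sampling or gradient-based updates of the posterior.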

## Key findings

- Achieves speech enhancement performance comparable to sampling-based methods.
- Reduces the computational cost of reaching a given performance by a factor of 36.
- Uses the VAE encoder for efficient variational approximation.

## Abstract

Recent studies have explored the use of deep generative models of speech spectra based on variational autoencoders (VAEs), combined with unsupervised noise models, to perform speech enhancement. These studies developed iterative algorithms involving either Gibbs sampling or gradient descent at each step, making them computationally expensive. This paper proposes a variational inference method to iteratively estimate the power spectrogram of the clean speech. Our main contribution is the analytical derivation of the variational steps in which the encoder of the pre-learned VAE can be used to estimate the variational approximation of the true posterior distribution, using the very same assumption made to train VAEs. Experiments show that the proposed method produces results on par with the aforementioned iterative methods using sampling, while decreasing the computational cost by a factor of 36 to reach a given performance.
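To make "estimating the power spectrogram of the clean speech" concrete, the following is a minimal sketch of a single per-bin estimate. It is not the paper's algorithm. It assumes an additive model in which the mixture is the sum of independent zero-mean complex Gaussian speech and noise, with known, positive per-bin variances; in a VAE-based system the speech variance would come from the decoder and the noise variance from the noise model. The function names `posterior_speech_power` and `enhance` are hypothetical.

```python
def posterior_speech_power(mix_power, speech_var, noise_var):
    """Posterior second moment E[|s|^2 | x] for one time-frequency bin.

    Assumption of this sketch (not stated in the summary): x = s + b with
    independent zero-mean complex Gaussian s and b of variances
    speech_var and noise_var, where speech_var + noise_var > 0.
    """
    # Wiener gain, the weight given to the mixture when estimating speech.
    gain = speech_var / (speech_var + noise_var)
    # |posterior mean|^2 + posterior variance, where the posterior variance
    # speech_var * noise_var / (speech_var + noise_var) equals gain * noise_var.
    return gain * gain * mix_power + gain * noise_var


def enhance(mix_power, speech_var, noise_var):
    """Apply the per-bin estimate to spectrograms stored as nested lists."""
    return [
        [
            posterior_speech_power(x, vs, vn)
            for x, vs, vn in zip(row_x, row_s, row_n)
        ]
        for row_x, row_s, row_n in zip(mix_power, speech_var, noise_var)
    ]


if __name__ == "__main__":
    # One frame with two frequency bins; all values are illustrative.
    mixture = [[4.0, 2.0]]
    speech = [[1.0, 3.0]]
    noise = [[1.0, 1.0]]
    print(enhance(mixture, speech, noise))
```

In an iterative scheme of the kind the abstract describes, an estimate like this would be recomputed each time the speech and noise variances are updated; the cost of each iteration then depends mainly on how the latent posterior is obtained.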

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.01209/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/1905.01209/full.md

## References

30 references — full list in the complete paper: https://tomesphere.com/paper/1905.01209/full.md

---
Source: https://tomesphere.com/paper/1905.01209