# SEGAN: Speech Enhancement Generative Adversarial Network

**Authors:** Santiago Pascual, Antonio Bonafonte, Joan Serrà

arXiv: 1703.09452 · 2017-06-12

## TL;DR

This paper introduces SEGAN, an end-to-end generative adversarial network that enhances speech directly at the waveform level. A single model covers 28 speakers and 40 noise conditions, and objective and subjective evaluations on an unseen test set support its effectiveness.

## Contribution

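The paper proposes using GANs for speech enhancement directly at the waveform level, in contrast to techniques that operate in the spectral domain. One model is trained end-to-end with parameters shared across all speakers and noise conditions, and the authors present it as opening the exploration of generative architectures for speech enhancement.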

## Key findings

- Model trained on 28 speakers and 40 noise conditions
- Evaluated on an independent, unseen test set with two speakers and 20 alternative noise conditions
- Both objective and subjective evaluations confirm the model's effectiveness
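
To make "noise condition" concrete, here is a minimal sketch of how a noisy waveform could be built from a clean one. It assumes that a noise condition means a noise recording mixed in at a chosen signal-to-noise ratio (SNR); this summary does not define the term, so treat that reading as an assumption. The function names `mix_at_snr` and `measured_snr_db`, the 16 kHz sample rate, and the synthetic sine and Gaussian signals are all illustrative stand-ins, not data or code from the paper.

```python
import numpy as np


def mix_at_snr(clean: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Add `noise` to `clean`, scaled so the mixture has the requested SNR in dB."""
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    # Choose gain g so that 10*log10(clean_power / (g^2 * noise_power)) == snr_db.
    gain = np.sqrt(clean_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return clean + gain * noise


def measured_snr_db(clean: np.ndarray, noisy: np.ndarray) -> float:
    """Recover the SNR of a mixture, given the clean reference signal."""
    residual = noisy - clean
    return float(10.0 * np.log10(np.mean(clean ** 2) / np.mean(residual ** 2)))


rng = np.random.default_rng(0)
sample_rate = 16_000  # assumed sample rate, for illustration only
t = np.arange(sample_rate) / sample_rate

clean = 0.5 * np.sin(2 * np.pi * 220.0 * t)  # synthetic stand-in for clean speech
noise = rng.standard_normal(sample_rate)     # synthetic stand-in for a noise recording

noisy = mix_at_snr(clean, noise, snr_db=5.0)
```

Pairs of `(noisy, clean)` waveforms like this are the kind of input and target an end-to-end enhancement model would train on.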

## Abstract

Current speech enhancement techniques operate on the spectral domain and/or exploit some higher-level feature. The majority of them tackle a limited number of noise conditions and rely on first-order statistics. To circumvent these issues, deep networks are being increasingly used, thanks to their ability to learn complex functions from large example sets. In this work, we propose the use of generative adversarial networks for speech enhancement. In contrast to current techniques, we operate at the waveform level, training the model end-to-end, and incorporate 28 speakers and 40 different noise conditions into the same model, such that model parameters are shared across them. We evaluate the proposed model using an independent, unseen test set with two speakers and 20 alternative noise conditions. The enhanced samples confirm the viability of the proposed model, and both objective and subjective evaluations confirm the effectiveness of it. With that, we open the exploration of generative architectures for speech enhancement, which may progressively incorporate further speech-centric design choices to improve their performance.
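
The abstract describes adversarial training on raw waveforms, but this summary does not state which objective or network architecture is used. The following is a rough sketch of one generic possibility: the binary cross-entropy losses of the original GAN formulation, evaluated on batches of waveform chunks. The functions `toy_generator` and `toy_discriminator` are hypothetical linear stand-ins rather than the paper's networks, and conditioning the discriminator on the noisy input is an assumption of this sketch.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def bce(prob, target):
    """Mean binary cross-entropy between predicted probabilities and 0/1 targets."""
    eps = 1e-12  # guards against log(0)
    return float(-np.mean(target * np.log(prob + eps)
                          + (1.0 - target) * np.log(1.0 - prob + eps)))


def toy_generator(noisy, w_g):
    # Hypothetical stand-in for an enhancement network: one gain on every sample.
    return w_g * noisy


def toy_discriminator(candidate, noisy, w_d):
    # Hypothetical stand-in: logistic score of the (candidate, noisy) pair.
    features = np.concatenate([candidate, noisy], axis=-1)
    return sigmoid(features @ w_d)


rng = np.random.default_rng(0)
batch, length = 4, 64  # illustrative sizes only

clean = rng.standard_normal((batch, length))                # synthetic "clean" chunks
noisy = clean + 0.3 * rng.standard_normal((batch, length))  # synthetic "noisy" chunks

w_g = 0.9                                     # toy generator parameter
w_d = 0.01 * rng.standard_normal(2 * length)  # toy discriminator parameters

enhanced = toy_generator(noisy, w_g)
p_real = toy_discriminator(clean, noisy, w_d)     # should be pushed toward 1
p_fake = toy_discriminator(enhanced, noisy, w_d)  # should be pushed toward 0

# Discriminator: label real pairs 1 and generated pairs 0.
d_loss = bce(p_real, np.ones(batch)) + bce(p_fake, np.zeros(batch))
# Generator: try to make the discriminator label generated pairs 1.
g_loss = bce(p_fake, np.ones(batch))
```

In an actual training loop the two losses would be minimized alternately with respect to the discriminator and generator parameters. Here they are only evaluated once to show how the waveform batches flow through each term.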

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1703.09452/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/1703.09452/full.md

## References

36 references — full list in the complete paper: https://tomesphere.com/paper/1703.09452/full.md

---
Source: https://tomesphere.com/paper/1703.09452