# GELP: GAN-Excited Linear Prediction for Speech Synthesis from Mel-spectrogram

**Authors:** Lauri Juvela, Bajibabu Bollepalli, Junichi Yamagishi, Paavo Alku

arXiv: 1904.03976 · 2019-06-27

## TL;DR

This paper introduces GELP, a parallel neural vocoder that combines GAN-based training with a linear predictive synthesis filter. For speech synthesis from mel-spectrograms, the authors report a significant improvement in inference speed while outperforming a WaveNet vocoder in copy-synthesis quality.

## Contribution

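The paper proposes an alternative training strategy for a parallel neural vocoder, based on generative adversarial networks, and integrates a linear predictive synthesis filter into the model. This targets two problems named in the abstract: slow sequential inference in autoregressive vocoders such as WaveNet, and the training difficulty and growing model size of existing parallel alternatives.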

## Key findings

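- Significant improvement in inference speed, since the vocoder generates in parallel rather than sample by sample.
- Outperforms a WaveNet vocoder in copy-synthesis quality.
- A linear predictive synthesis filter is integrated directly into the model.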
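
To make the linear predictive filtering idea concrete, here is a minimal sketch of a textbook all-pole synthesis filter and its inverse. It is not the paper's implementation. The function names `lp_synthesis` and `lp_residual`, the sign convention s[n] = e[n] + Σ a_k·s[n−k], and the toy coefficients are all assumptions made for illustration. In a vocoder of this kind, the excitation would come from a neural generator and the coefficients from the acoustic features.

```python
def lp_synthesis(excitation, lp_coeffs):
    """Textbook all-pole synthesis: s[n] = e[n] + sum_k a_k * s[n-k].

    excitation: list of floats (the residual signal e[n])
    lp_coeffs:  list of predictor coefficients [a_1, ..., a_p]
                (sign convention is an assumption of this sketch)
    """
    order = len(lp_coeffs)
    speech = []
    for n, e_n in enumerate(excitation):
        prediction = 0.0
        for k in range(1, order + 1):
            if n - k >= 0:  # samples before n = 0 are treated as zero
                prediction += lp_coeffs[k - 1] * speech[n - k]
        speech.append(e_n + prediction)
    return speech


def lp_residual(speech, lp_coeffs):
    """Inverse (analysis) filter: e[n] = s[n] - sum_k a_k * s[n-k]."""
    order = len(lp_coeffs)
    residual = []
    for n, s_n in enumerate(speech):
        prediction = 0.0
        for k in range(1, order + 1):
            if n - k >= 0:
                prediction += lp_coeffs[k - 1] * speech[n - k]
        residual.append(s_n - prediction)
    return residual


# Toy example: a unit impulse through a first-order filter decays geometrically.
impulse_response = lp_synthesis([1.0, 0.0, 0.0, 0.0], [0.5])
print(impulse_response)
```

The recursion above is sequential and only shows what the filter computes. Analysis followed by synthesis with the same coefficients recovers the original signal, which is why a model can generate the spectrally flat excitation and leave the spectral envelope to the filter.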

## Abstract

Recent advances in neural network-based text-to-speech have reached human-level naturalness in synthetic speech. The present sequence-to-sequence models can directly map text to mel-spectrogram acoustic features, which are convenient for modeling, but present additional challenges for vocoding (i.e., waveform generation from the acoustic features). High-quality synthesis can be achieved with neural vocoders, such as WaveNet, but such autoregressive models suffer from slow sequential inference. Meanwhile, their existing parallel inference counterparts are difficult to train and require increasingly large model sizes. In this paper, we propose an alternative training strategy for a parallel neural vocoder utilizing generative adversarial networks, and integrate a linear predictive synthesis filter into the model. Results show that the proposed model achieves significant improvement in inference speed, while outperforming a WaveNet in copy-synthesis quality.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.03976/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/1904.03976/full.md

## References

36 references — full list in the complete paper: https://tomesphere.com/paper/1904.03976/full.md

---
Source: https://tomesphere.com/paper/1904.03976