# Analysing Deep Learning-Spectral Envelope Prediction Methods for Singing Synthesis

**Authors:** Frederik Bous, Axel Roebel

arXiv: 1903.01161 · 2019-07-01

## TL;DR

This paper investigates neural network hyper-parameters for spectral envelope prediction in singing synthesis, demonstrating that 2D convolutions and iterative frame prediction improve quality, leading to a new superior architecture.

## Contribution

It introduces a new neural network architecture with optimized hyper-parameters for spectral envelope prediction in singing synthesis, outperforming existing methods.

## Key findings

- 2D convolutions outperform 1D convolutions in spectral envelope prediction.
- Predicting multiple frames iteratively during training yields better results than injecting noise into the input data.
- In the authors' perceptual tests, the proposed architecture produces better results than the other state-of-the-art methods it was compared against.
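To make the first finding concrete, here is a minimal NumPy sketch of one common reading of the 1D/2D distinction. It assumes the 1D variant convolves along time and treats frequency bins as channels, while the 2D variant slides a small kernel over both time and frequency. The function names, array shapes, and kernel sizes are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def conv1d_time(x, w):
    """Valid convolution along time; frequency bins act as channels.

    x: (T, F) sequence of spectral-envelope frames (hypothetical layout)
    w: (K, F, F_out) kernel covering K frames and all F input bins
    """
    T, F = x.shape
    K, F_in, F_out = w.shape
    assert F == F_in
    out = np.zeros((T - K + 1, F_out))
    for t in range(T - K + 1):
        # Each output frame mixes every bin of K consecutive input frames.
        out[t] = np.einsum("kf,kfo->o", x[t:t + K], w)
    return out

def conv2d_time_freq(x, w):
    """Valid single-channel cross-correlation over time and frequency.

    x: (T, F) frames, w: (Kt, Kf) kernel shared across all bins.
    """
    T, F = x.shape
    Kt, Kf = w.shape
    out = np.zeros((T - Kt + 1, F - Kf + 1))
    for t in range(out.shape[0]):
        for f in range(out.shape[1]):
            out[t, f] = np.sum(x[t:t + Kt, f:f + Kf] * w)
    return out

rng = np.random.default_rng(0)
T, F, K = 16, 60, 3                      # illustrative sizes only
x = rng.standard_normal((T, F))
w1 = rng.standard_normal((K, F, F))      # 1D: one weight per (frame, bin_in, bin_out)
w2 = rng.standard_normal((K, K))         # 2D: one small kernel reused at every bin

y1 = conv1d_time(x, w1)
y2 = conv2d_time_freq(x, w2)
params_1d = w1.size
params_2d = w2.size

# Shifting the input by one bin shifts the 2D output by one bin as well.
y2_shifted = conv2d_time_freq(np.roll(x, 1, axis=1), w2)
```

In this toy setup the 2D kernel shares its weights across frequency, so it needs far fewer parameters per filter and responds consistently when a spectral pattern moves to a neighbouring bin. Whether this is the reason for the quality difference the authors report is not stated in this summary.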

## Abstract

We conduct an investigation on various hyper-parameters regarding neural networks used to generate spectral envelopes for singing synthesis. Two perceptive tests, where the first compares two models directly and the other ranks models with a mean opinion score, are performed. With these tests we show that when learning to predict spectral envelopes, 2d-convolutions are superior over previously proposed 1d-convolutions and that predicting multiple frames in an iterated fashion during training is superior over injecting noise to the input data. An experimental investigation whether learning to predict a probability distribution vs. single samples was performed but turned out to be inconclusive. A network architecture is proposed that incorporates the improvements which we found to be useful and we show in our experiments that this network produces better results than other state-of-the-art methods.
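The two training strategies contrasted in the abstract can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the predictor is a hypothetical linear map, the loss is a mean squared error, and the names `predict_next`, `loss_noise_injection`, and `loss_iterated` are invented for this example.

```python
import numpy as np

def predict_next(context, W):
    """Toy linear predictor (assumption): next frame from P context frames."""
    return context.reshape(-1) @ W

def loss_noise_injection(frames, W, P, sigma, rng):
    """One-step loss with Gaussian noise added to the ground-truth context."""
    noisy = frames[:P] + sigma * rng.standard_normal(frames[:P].shape)
    pred = predict_next(noisy, W)
    return float(np.mean((pred - frames[P]) ** 2))

def loss_iterated(frames, W, P, n_steps):
    """Multi-step loss: each prediction is fed back as context for the next."""
    context = frames[:P].copy()
    losses = []
    for i in range(n_steps):
        pred = predict_next(context, W)
        losses.append(np.mean((pred - frames[P + i]) ** 2))
        # Slide the window forward using the model's own output,
        # so later steps see the model's errors rather than clean data.
        context = np.vstack([context[1:], pred])
    return float(np.mean(losses))

# Tiny check: a predictor that copies the last frame is exact on constant data.
P, F = 2, 4
W_copy = np.zeros((P * F, F))
W_copy[F:, :] = np.eye(F)                # select the most recent frame
frames = np.ones((6, F))
rng = np.random.default_rng(0)

iterated = loss_iterated(frames, W_copy, P, n_steps=4)
clean = loss_noise_injection(frames, W_copy, P, sigma=0.0, rng=rng)
noisy = loss_noise_injection(frames, W_copy, P, sigma=0.5, rng=rng)
```

Both strategies aim to expose the model during training to imperfect inputs like those it meets at synthesis time. Noise injection approximates those imperfections with random perturbations, while the iterated variant uses the model's actual prediction errors.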

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1903.01161/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/1903.01161/full.md

## References

22 references — full list in the complete paper: https://tomesphere.com/paper/1903.01161/full.md

---
Source: https://tomesphere.com/paper/1903.01161