# WGANSing: A Multi-Voice Singing Voice Synthesizer Based on the Wasserstein-GAN

**Authors:** Pritish Chandna, Merlijn Blaauw, Jordi Bonada, Emilia Gomez

arXiv: 1903.10729 · 2020-02-13

## TL;DR

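This paper introduces WGANSing, a multi-voice singing voice synthesizer built on a DCGAN-inspired network trained with the Wasserstein-GAN algorithm. It predicts vocoder parameters so that pitch and timbre are modelled separately, and it is reported to be competitive with the state of the art in both objective metrics and a listening test.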

## Contribution

The paper proposes a GAN-based singing voice synthesizer that predicts vocoder parameters, separating the influence of pitch from that of timbre. It processes the input in blocks of consecutive frames so that temporal dependencies within each block can be modelled.

## Key findings

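- Performance is competitive with state-of-the-art methods and with the original sample
- Both objective metrics and a subjective listening test support this result
- Synthesis examples are provided on a supplementary website, and the source code is available on GitHub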

## Abstract

We present a deep neural network based singing voice synthesizer, inspired by the Deep Convolutional Generative Adversarial Networks (DCGAN) architecture and optimized using the Wasserstein-GAN algorithm. We use vocoder parameters for acoustic modelling, to separate the influence of pitch and timbre. This facilitates the modelling of the large variability of pitch in the singing voice. Our network takes a block of consecutive frame-wise linguistic and fundamental frequency features, along with global singer identity, as input and outputs the vocoder features corresponding to that block. This block-wise approach, along with the training methodology, allows us to model temporal dependencies within the features of the input block. For inference, sequential blocks are concatenated using an overlap-add procedure. We show that the performance of our model is competitive with regard to the state of the art and the original sample, using objective metrics and a subjective listening test. We also present examples of the synthesis on a supplementary website and the source code via GitHub.
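
The overlap-add step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `overlap_add`, the triangular cross-fade window, and the `hop` parameter are all assumptions made for illustration, since this summary does not state the block length, the overlap, or the window that the paper uses.

```python
import numpy as np


def overlap_add(blocks, hop):
    """Stitch equally sized feature blocks into one continuous sequence.

    blocks: array-like of shape (n_blocks, block_len, n_features).
    hop:    number of frames between the starts of consecutive blocks
            (hypothetical parameter; overlap = block_len - hop).

    Overlapping frames are cross-faded with a triangular window and then
    divided by the summed window weights, so regions covered by one block
    and regions covered by several blocks are scaled consistently.
    """
    blocks = np.asarray(blocks, dtype=np.float64)
    n_blocks, block_len, n_features = blocks.shape
    if not 0 < hop <= block_len:
        raise ValueError("hop must satisfy 0 < hop <= block_len")

    total_len = hop * (n_blocks - 1) + block_len

    # Assumed window: a triangle with its zero endpoints trimmed, so every
    # frame receives a strictly positive weight.
    window = np.bartlett(block_len + 2)[1:-1]

    out = np.zeros((total_len, n_features))
    weight = np.zeros(total_len)
    for i, block in enumerate(blocks):
        start = i * hop
        out[start:start + block_len] += block * window[:, None]
        weight[start:start + block_len] += window

    return out / weight[:, None]


# Usage: three blocks of 8 frames with 2 features each, 50% overlap.
example_blocks = np.full((3, 8, 2), 2.0)
stitched = overlap_add(example_blocks, hop=4)
print(stitched.shape)
```

Normalizing by the accumulated window weights means the result does not depend on the window summing to one across overlaps, which keeps the sketch valid for any hop up to the block length.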

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1903.10729/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/1903.10729/full.md

## References

23 references — full list in the complete paper: https://tomesphere.com/paper/1903.10729/full.md

---
Source: https://tomesphere.com/paper/1903.10729