Improve GAN-based Neural Vocoder using Pointwise Relativistic LeastSquare GAN
Congyi Wang, Yu Chen, Bin Wang, Yi Shi

TL;DR
This paper introduces PRLSGAN, a pointwise relativistic variant of LSGAN for neural vocoders. It takes the score distribution into account and combines the original MSE loss with a pointwise relative discrepancy loss to improve waveform synthesis quality, and it can be applied to existing GAN-based vocoders.
Contribution
The paper proposes PRLSGAN, a relativistic LSGAN framework that improves the output quality of GAN-based neural vocoders by adding a pointwise relative discrepancy loss to the original MSE loss.
Findings
PRLSGAN improves waveform quality in Parallel WaveGAN and MelGAN.
The framework demonstrates consistent performance gains.
PRLSGAN shows strong generalization across different vocoders.
Abstract
GAN-based neural vocoders, such as Parallel WaveGAN and MelGAN, have attracted great interest due to their lightweight and parallel structures, which enable them to generate high-fidelity waveforms in real time. In this paper, inspired by Relativistic GAN, we introduce a novel variant of the LSGAN framework in the context of waveform synthesis, named Pointwise Relativistic LSGAN (PRLSGAN). In this approach, we take the truism score distribution into consideration and combine the original MSE loss with the proposed pointwise relative discrepancy loss, making it harder for the generator to fool the discriminator and leading to improved generation quality. Moreover, PRLSGAN is a general-purpose framework that can be combined with any GAN-based neural vocoder to enhance its generation quality. Experiments have shown a consistent performance boost based on Parallel WaveGAN and…
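The loss combination described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so everything below is an assumption: the pointwise term is modeled on the relativistic LSGAN margin form, applied to paired real and generated scores rather than batch averages, and the function names and the `weight` parameter are hypothetical, not the authors' definitions.

```python
def _mean(values):
    """Arithmetic mean of an iterable of floats."""
    values = list(values)
    return sum(values) / len(values)


def lsgan_d_loss(d_real, d_fake):
    """Standard LSGAN (MSE) discriminator loss: real scores -> 1, fake scores -> 0."""
    return _mean((r - 1.0) ** 2 for r in d_real) + _mean(f ** 2 for f in d_fake)


def lsgan_g_loss(d_fake):
    """Standard LSGAN (MSE) generator loss: fake scores -> 1."""
    return _mean((f - 1.0) ** 2 for f in d_fake)


def pointwise_relative_d_loss(d_real, d_fake):
    """ASSUMED form: each real score should exceed its paired fake score by 1.

    Pairing is possible in a vocoder because the real and generated waveforms
    are conditioned on the same acoustic features.
    """
    return _mean((r - f - 1.0) ** 2 for r, f in zip(d_real, d_fake))


def pointwise_relative_g_loss(d_real, d_fake):
    """ASSUMED form: the generator tries to reverse the margin at each position."""
    return _mean((f - r - 1.0) ** 2 for r, f in zip(d_real, d_fake))


def combined_d_loss(d_real, d_fake, weight=1.0):
    """MSE loss plus the weighted pointwise relative term (weight is hypothetical)."""
    return lsgan_d_loss(d_real, d_fake) + weight * pointwise_relative_d_loss(d_real, d_fake)


def combined_g_loss(d_real, d_fake, weight=1.0):
    """Generator counterpart of combined_d_loss."""
    return lsgan_g_loss(d_fake) + weight * pointwise_relative_g_loss(d_real, d_fake)


if __name__ == "__main__":
    # Toy discriminator scores for two paired positions.
    real_scores = [0.9, 0.8]
    fake_scores = [0.2, 0.4]
    print(combined_d_loss(real_scores, fake_scores))
    print(combined_g_loss(real_scores, fake_scores))
```

Under these assumptions, the relative term penalizes the generator whenever a generated score trails its paired real score, even if the generated score already looks plausible in isolation, which is one way to read the abstract's claim that the discriminator becomes harder to fool.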
Taxonomy
Topics: Speech Recognition and Synthesis · Music and Audio Processing · Handwritten Text Recognition Techniques
Methods: Relativistic GAN · 1x1 Convolution · Weight Normalization · Average Pooling · Grouped Convolution · Window-based Discriminator · Dropout · Dilated Convolution · Residual Connection
