Baseline System of Voice Conversion Challenge 2020 with Cyclic Variational Autoencoder and Parallel WaveGAN
Patrick Lumban Tobing, Yi-Chiao Wu, Tomoki Toda

TL;DR
This paper describes the baseline voice conversion system of VCC 2020, which combines a cyclic variational autoencoder (CycleVAE) with the Parallel WaveGAN (PWG) neural vocoder. A single unified model covers both VCC 2020 tasks, and the implementation is open source.
Contribution
It presents a novel nonparallel voice conversion system using CycleVAE and PWG, applicable to both intra- and cross-lingual tasks, with publicly available code.
Findings
Achieved a mean opinion score (MOS) of 2.87 for naturalness in Task 1 (intralingual)
Attained 75.37% speaker similarity in Task 1
Reached a MOS of 2.56 and 56.46% speaker similarity in Task 2 (cross-lingual)
Abstract
In this paper, we present a description of the baseline system of Voice Conversion Challenge (VCC) 2020 with a cyclic variational autoencoder (CycleVAE) and Parallel WaveGAN (PWG), i.e., CycleVAEPWG. CycleVAE is a nonparallel VAE-based voice conversion that utilizes converted acoustic features to consider cyclically reconstructed spectra during optimization. On the other hand, PWG is a non-autoregressive neural vocoder that is based on a generative adversarial network for a high-quality and fast waveform generator. In practice, the CycleVAEPWG system can be straightforwardly developed with the VCC 2020 dataset using a unified model for both Task 1 (intralingual) and Task 2 (cross-lingual), where our open-source implementation is available at https://github.com/bigpon/vcc20_baseline_cyclevae. The results of VCC 2020 have demonstrated that the CycleVAEPWG baseline achieves the following:…
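The cyclic reconstruction idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the linear `encode` and `decode` functions, the dimensions, and the mean-squared-error losses are all hypothetical stand-ins chosen for brevity, and the sketch leaves out latent sampling and the KL term that a real VAE would include. It only shows the data flow: features are converted to a target speaker, then encoded again and decoded back to the source speaker, so a reconstruction error can be computed without parallel target data.

```python
import numpy as np

rng = np.random.default_rng(0)
FEAT_DIM, LATENT_DIM, N_SPEAKERS = 8, 4, 2  # toy sizes (assumptions)

# Toy linear weights standing in for the encoder and decoder networks (hypothetical).
W_enc = rng.normal(size=(FEAT_DIM, LATENT_DIM)) * 0.1
W_dec = rng.normal(size=(LATENT_DIM + N_SPEAKERS, FEAT_DIM)) * 0.1


def one_hot(speaker, n=N_SPEAKERS):
    """Return a one-hot speaker code of length n."""
    code = np.zeros(n)
    code[speaker] = 1.0
    return code


def encode(x):
    """Map spectral frames of shape (T, FEAT_DIM) to latents of shape (T, LATENT_DIM)."""
    return x @ W_enc


def decode(z, speaker):
    """Reconstruct frames from latents concatenated with a speaker code."""
    code = np.tile(one_hot(speaker), (z.shape[0], 1))
    return np.concatenate([z, code], axis=1) @ W_dec


def cycle_losses(x, src, tgt):
    """Return (reconstruction loss, cyclic reconstruction loss) for source frames x."""
    z = encode(x)
    x_rec = decode(z, src)           # ordinary reconstruction with the source speaker code
    x_conv = decode(z, tgt)          # converted features; no ground truth exists for these
    x_cyc = decode(encode(x_conv), src)  # converted features mapped back to the source speaker
    rec = float(np.mean((x - x_rec) ** 2))
    cyc = float(np.mean((x - x_cyc) ** 2))
    return rec, cyc


frames = rng.normal(size=(5, FEAT_DIM))  # five random "spectral" frames
rec_loss, cyc_loss = cycle_losses(frames, src=0, tgt=1)
print(rec_loss, cyc_loss)
```

In training, both losses would be minimized together, so the converted features receive a gradient signal through the cyclic path even though they are never compared against target-speaker recordings directly.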
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing
Methods: Convolution · Dense Connections · Tanh Activation · WGAN-GP Loss · Phase Shuffle · Dropout · WaveGAN
