Voice Conversion Based on Cross-Domain Features Using Variational Auto Encoders

Wen-Chin Huang; Hsin-Te Hwang; Yu-Huai Peng; Yu Tsao; Hsin-Min Wang

arXiv:1808.09634 · eess.AS · April 9, 2020

TL;DR

This paper introduces a novel cross-domain VAE framework for voice conversion that leverages multiple spectral features, demonstrating improved performance over traditional VAEs through subjective evaluation.

Contribution

The study proposes a new VAE-based voice conversion method that simultaneously uses multiple spectral features, enhancing conversion quality.

Findings

01

CDVAE outperforms conventional VAE in subjective tests.

02

Using multiple spectral features improves voice conversion quality.

03

Explicit regularization of objectives enhances model performance.

Abstract

An effective approach to non-parallel voice conversion (VC) is to utilize deep neural networks (DNNs), specifically variational auto encoders (VAEs), to model the latent structure of speech in an unsupervised manner. A previous study has confirmed the effectiveness of VAE using the STRAIGHT spectra for VC. However, VAE using other types of spectral features such as mel-cepstral coefficients (MCCs), which are related to human perception and have been widely used in VC, have not been properly investigated. Instead of using one specific type of spectral feature, it is expected that VAE may benefit from using multiple types of spectral features simultaneously, thereby improving the capability of VAE for VC. To this end, we propose a novel VAE framework (called cross-domain VAE, CDVAE) for VC. Specifically, the proposed framework utilizes both STRAIGHT spectra and MCCs by…
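For readers unfamiliar with the VAE objective the abstract builds on, the following is a minimal sketch of the generic per-frame VAE loss (reconstruction error plus a KL term) and of conversion by swapping a speaker code at the decoder. It is not the authors' implementation and does not reproduce the CDVAE objective, whose details are cut off in the abstract above. The linear `toy_decoder`, the one-hot speaker code, and all array shapes are assumptions chosen for illustration only.

```python
import numpy as np


def gaussian_kl(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)), summed over latent dims."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=-1)


def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps


def vae_loss(x, x_hat, mu, log_var):
    """Mean over frames of squared-error reconstruction plus KL."""
    recon = 0.5 * np.sum((x - x_hat) ** 2, axis=-1)
    return np.mean(recon + gaussian_kl(mu, log_var))


def toy_decoder(z, speaker_onehot, W_z, W_s):
    """Hypothetical linear decoder conditioned on a speaker code (assumption)."""
    return z @ W_z + speaker_onehot @ W_s


# Toy setup (all sizes are arbitrary): 3 frames, 4 latent dims,
# 8 spectral dims, 2 speakers.
rng = np.random.default_rng(0)
mu = np.zeros((3, 4))
log_var = np.zeros((3, 4))
z = reparameterize(mu, log_var, rng)

W_z = rng.standard_normal((4, 8))
W_s = rng.standard_normal((2, 8))
src = np.tile([1.0, 0.0], (3, 1))  # source-speaker code for every frame
tgt = np.tile([0.0, 1.0], (3, 1))  # target-speaker code for every frame

x_rec = toy_decoder(z, src, W_z, W_s)   # reconstruction with the source code
x_conv = toy_decoder(z, tgt, W_z, W_s)  # "conversion": same z, target code
```

In this toy setting the latent code `z` is kept fixed and only the speaker code changes, which is the usual intuition behind VAE-based conversion: the encoder is expected to capture speaker-independent content, and the decoder re-adds speaker identity.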

Code & Models

Repositories

unilight/cdvae-vc
tf
