The Voice Conversion Challenge 2018: Promoting Development of Parallel and Nonparallel Methods

Jaime Lorenzo-Trueba; Junichi Yamagishi; Tomoki Toda; Daisuke Saito; Fernando Villavicencio; Tomi Kinnunen; Zhenhua Ling

arXiv:1804.04262 · eess.AS · April 13, 2018 · 66 cites

Open Access

TL;DR

The Voice Conversion Challenge 2018 provided a standardized framework for evaluating and comparing state-of-the-art voice conversion systems, including both parallel and non-parallel data approaches, through large-scale perceptual testing.

Contribution

This paper introduces the 2018 challenge framework, encompassing new tasks and evaluation methods, to advance the development of voice conversion technology.

Findings

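01

23 teams submitted systems, 11 of which also entered the optional Spoke task

02

A large-scale crowdsourced perceptual evaluation rated converted speech on naturalness and similarity to the target speaker

03

Results highlight progress and remaining challenges in VC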

Abstract

We present the Voice Conversion Challenge 2018, designed as a follow-up to the 2016 edition with the aim of providing a common framework for evaluating and comparing different state-of-the-art voice conversion (VC) systems. The objective of the challenge was to perform speaker conversion (i.e. transform the vocal identity) of a source speaker to a target speaker while maintaining linguistic information. As an update to the previous challenge, we considered both parallel and non-parallel data to form the Hub and Spoke tasks, respectively. A total of 23 teams from around the world submitted their systems; 11 of them additionally participated in the optional Spoke task. A large-scale crowdsourced perceptual evaluation was then carried out to rate the submitted converted speech in terms of naturalness and similarity to the target speaker identity. In this paper, we present a brief summary…

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Voice and Speech Disorders