Voicy: Zero-Shot Non-Parallel Voice Conversion in Noisy Reverberant Environments

Alejandro Mottini; Jaime Lorenzo-Trueba; Sri Vishnu Kumar Karlapati; Thomas Drugman

arXiv:2106.08873·cs.SD·June 17, 2021

TL;DR

Voicy is a novel zero-shot voice conversion framework designed to operate effectively in noisy and reverberant environments, outperforming existing methods in naturalness and speaker similarity.

Contribution

It introduces a multi-encoder architecture inspired by de-noising auto-encoders for non-parallel zero-shot voice conversion in challenging acoustic conditions.

Findings

01. Voicy outperforms existing VC methods in noisy reverberant settings.
02. The framework achieves higher naturalness and speaker similarity than existing methods.
03. The approach was validated on a noisy reverberant version of the LibriSpeech dataset.

Abstract

Voice Conversion (VC) is a technique that aims to transform the non-linguistic information of a source utterance to change the perceived identity of the speaker. While there is a rich literature on VC, most proposed methods are trained and evaluated on clean speech recordings. However, many acoustic environments are noisy and reverberant, severely restricting the applicability of popular VC methods to such scenarios. To address this limitation, we propose Voicy, a new VC framework particularly tailored for noisy speech. Our method, which is inspired by the de-noising auto-encoder framework, comprises four encoders (speaker, content, phonetic and acoustic-ASR) and one decoder. Importantly, Voicy is capable of performing non-parallel zero-shot VC, an important requirement for any VC system that needs to work on speakers not seen during training. We have validated our approach using…
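
The abstract names four encoders and one decoder but gives no detail of their internals. As a rough illustration of the data flow only, the sketch below wires four stand-in encoders into one decoder and shows the two ideas the abstract mentions: zero-shot conversion by taking the speaker embedding from a different utterance, and a de-noising objective that reconstructs clean features from noisy input. Everything here is an assumption made for illustration: the random linear maps stand in for trained networks, and the feature dimensions, the function names (`convert`, `denoising_loss`), the choice of which utterance feeds which encoder, and the mean-squared-error loss are all hypothetical and not taken from the paper or its repository.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature sizes; the paper's actual dimensions are not given here.
N_MELS, D_SPK, D_CONTENT, D_PHON, D_ASR = 80, 16, 32, 8, 8


def linear(in_dim, out_dim):
    """Return a fixed random projection standing in for a trained network."""
    w = rng.standard_normal((in_dim, out_dim)) / np.sqrt(in_dim)
    return lambda x: x @ w


# Frame-level encoders keep the time axis: (frames, N_MELS) -> (frames, dim).
content_enc = linear(N_MELS, D_CONTENT)
phonetic_enc = linear(N_MELS, D_PHON)
asr_enc = linear(N_MELS, D_ASR)

# The speaker encoder pools over time to produce one vector per utterance.
_spk_proj = linear(N_MELS, D_SPK)


def speaker_enc(mel):
    return _spk_proj(mel.mean(axis=0))


decoder = linear(D_SPK + D_CONTENT + D_PHON + D_ASR, N_MELS)


def convert(source_mel, target_mel):
    """Decode the source's frame-level codes with the target's speaker vector."""
    frames = source_mel.shape[0]
    spk = np.tile(speaker_enc(target_mel), (frames, 1))
    codes = np.concatenate(
        [spk, content_enc(source_mel), phonetic_enc(source_mel), asr_enc(source_mel)],
        axis=1,
    )
    return decoder(codes)


def denoising_loss(noisy_mel, clean_mel):
    """De-noising auto-encoder objective (assumed MSE): noisy in, clean target."""
    reconstruction = convert(noisy_mel, noisy_mel)
    return float(np.mean((reconstruction - clean_mel) ** 2))


# Toy usage with random "spectrograms" of different lengths.
source = rng.standard_normal((120, N_MELS))   # utterance to convert
target = rng.standard_normal((90, N_MELS))    # unseen speaker's reference
converted = convert(source, target)           # same length as the source

clean = rng.standard_normal((120, N_MELS))
noisy = clean + 0.1 * rng.standard_normal(clean.shape)
loss = denoising_loss(noisy, clean)
```

In this sketch the converted output keeps the source's frame count, because only the utterance-level speaker vector comes from the target. That is what makes the setup non-parallel: source and target utterances need not share content or length.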

Code & Models

Repositories

alexa/amazon-voice-conversion-voicy (Official)
