Noise-robust voice conversion with domain adversarial training
Hongqiang Du, Lei Xie, Haizhou Li

TL;DR
This paper introduces a noise-robust voice conversion framework that uses domain adversarial training to extract noise-invariant speaker and content representations, significantly improving speech quality and speaker similarity in noisy conditions.
Contribution
The paper proposes a novel encoder-decoder voice conversion model with domain adversarial training to achieve noise-invariant representations, enhancing robustness in real-world noisy environments.
Findings
Effective in synthesizing clean speech from noisy inputs
Improves speech quality and speaker similarity under noisy conditions
Generalizes to noise types not seen during training
Abstract
Voice conversion has made great progress in the past few years under studio-quality test conditions, in terms of both speech quality and speaker similarity. However, in real applications, test speech from the source speaker or the target speaker can be corrupted by various environmental noises, which seriously degrade speech quality and speaker similarity. In this paper, we propose a novel encoder-decoder based noise-robust voice conversion framework, which consists of a speaker encoder, a content encoder, a decoder, and two domain adversarial neural networks. Specifically, we integrate a technique for disentangling speaker and content representations with domain adversarial training. Domain adversarial training maps the speaker representations and content representations, extracted by the speaker encoder and the content encoder respectively, from clean speech and noisy speech into the same space. In this…
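To make the domain adversarial idea in the abstract concrete, here is a minimal sketch of one adversarial update on a toy scalar model. It assumes the common gradient-reversal formulation of domain adversarial training; the abstract does not say whether the authors implement it this way. The function name `dann_step`, the scalar "encoder" weight `w`, the domain classifier weight `v`, and the parameters `lam` and `lr` are all hypothetical and chosen only for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dann_step(w, v, x, domain, lam=1.0, lr=0.1):
    """One domain-adversarial update on a toy scalar model (illustrative only).

    w      : encoder weight, representation h = w * x
    v      : domain classifier weight, p(noisy) = sigmoid(v * h)
    domain : 0 for clean speech, 1 for noisy speech
    lam    : strength of the gradient reversal (assumed hyperparameter)
    """
    h = w * x                      # encoder output
    p = sigmoid(v * h)             # classifier's belief that the input is noisy
    # Binary cross-entropy loss of the domain classifier
    loss = -(domain * math.log(p) + (1 - domain) * math.log(1 - p))

    dlogit = p - domain            # d loss / d (v * h)
    grad_v = dlogit * h            # classifier gradient: ordinary descent
    grad_h = dlogit * v            # gradient reaching the representation
    grad_w = (-lam * grad_h) * x   # gradient reversal before the encoder

    v_new = v - lr * grad_v        # classifier gets better at telling domains apart
    w_new = w - lr * grad_w        # encoder moves to make domains harder to tell apart
    return w_new, v_new, loss


w_new, v_new, loss = dann_step(w=1.0, v=1.0, x=2.0, domain=1)
```

In this toy setting the classifier weight moves in the direction that lowers the domain loss, while the encoder weight moves in the opposite direction, which is the mechanism that pushes clean and noisy representations toward the same space. In the framework described above, one such adversarial branch would sit on the speaker encoder and one on the content encoder.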
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Voice and Speech Disorders
