An Adaptive Learning based Generative Adversarial Network for One-To-One Voice Conversion
Sandipan Dhar, Nanda Dulal Jana, Swagatam Das

TL;DR
This paper introduces ALGAN-VC, an adaptive learning-based GAN model with a Dense Residual Network architecture for improved one-to-one voice conversion, enhancing speech quality and speaker similarity.
Contribution
The paper proposes a novel adaptive learning mechanism and a Dense Residual Network architecture within a GAN framework for more effective voice conversion.
Findings
Achieved high speaker similarity in converted speech.
Demonstrated improved speech quality through subjective and objective evaluations.
Validated on multiple datasets, including the Voice Conversion Challenge (VCC) 2016, 2018, and 2020 datasets and a custom Indian language dataset.
Abstract
Voice Conversion (VC) has emerged as a significant area of research in speech synthesis in recent years, owing to its growing applications in voice-assisting technology, automated movie dubbing, and speech-to-singing conversion, to name a few. VC converts the vocal style of one speaker to that of another while keeping the linguistic content unchanged. The VC task is performed through a three-stage pipeline consisting of speech analysis, speech feature mapping, and speech reconstruction. Generative Adversarial Network (GAN) models are now widely used for speech feature mapping from the source to the target speaker. In this paper, we propose an adaptive learning-based GAN model called ALGAN-VC for efficient one-to-one VC of speakers. Our ALGAN-VC framework consists of some approaches to improve the speech quality and voice similarity between source and…
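The three-stage pipeline named in the abstract (speech analysis, speech feature mapping, speech reconstruction) can be illustrated with a minimal NumPy sketch. This is not ALGAN-VC: the function names `analyze`, `map_features`, and `reconstruct` are hypothetical, short-time Fourier magnitudes stand in for whatever speech features the paper actually uses, and the mapping stage is a placeholder gain where a trained GAN generator would go.

```python
import numpy as np

FRAME_LEN = 256  # samples per analysis frame (assumed value)
HOP = 128        # 50% overlap between consecutive frames (assumed value)


def _periodic_hann(n):
    # Periodic Hann window; shifted copies at 50% overlap sum exactly to 1.
    return 0.5 - 0.5 * np.cos(2.0 * np.pi * np.arange(n) / n)


def analyze(wave):
    """Stage 1, speech analysis: frame the waveform and take its spectrum."""
    window = _periodic_hann(FRAME_LEN)
    n_frames = 1 + (len(wave) - FRAME_LEN) // HOP
    frames = np.stack(
        [wave[i * HOP : i * HOP + FRAME_LEN] * window for i in range(n_frames)]
    )
    spectrum = np.fft.rfft(frames, axis=1)
    return np.abs(spectrum), np.angle(spectrum)


def map_features(magnitude, gain=1.0):
    """Stage 2, feature mapping: placeholder only.

    A real system would pass source-speaker features through a learned
    generator; here the features are merely scaled by a constant.
    """
    return gain * magnitude


def reconstruct(magnitude, phase):
    """Stage 3, speech reconstruction: inverse FFT plus overlap-add."""
    frames = np.fft.irfft(magnitude * np.exp(1j * phase), n=FRAME_LEN, axis=1)
    out = np.zeros(HOP * (len(frames) - 1) + FRAME_LEN)
    for i, frame in enumerate(frames):
        out[i * HOP : i * HOP + FRAME_LEN] += frame
    return out


# Usage: random noise stands in for a source-speaker waveform.
rng = np.random.default_rng(0)
source_wave = rng.standard_normal(1024)
magnitude, phase = analyze(source_wave)
converted_wave = reconstruct(map_features(magnitude), phase)
```

With the placeholder left at `gain=1.0`, the samples covered by two overlapping frames are recovered unchanged, which makes it easy to check the analysis and reconstruction stages in isolation before substituting a learned mapping.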
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Voice and Speech Disorders
