Data-augmented cross-lingual synthesis in a teacher-student framework
Marcel de Korte, Jaebok Kim, Aki Kunikoshi, Adaeze Adigwe, Esther Klabbers

TL;DR
This paper introduces a teacher-student framework with data augmentation for cross-lingual speech synthesis, enhancing speaker characteristic retention and naturalness in synthesized speech across languages.
Contribution
It proposes a novel data augmentation method using a teacher model to improve speaker characteristic preservation in cross-lingual synthesis within a teacher-student paradigm.
Findings
Improved retention of speaker characteristics in synthesized speech.
Maintained high naturalness and prosodic variation.
Enhanced generalization to unseen speaker-language pairs.
Abstract
Cross-lingual synthesis can be defined as the task of letting a speaker generate fluent synthetic speech in another language. This is a challenging task, and resulting speech can suffer from reduced naturalness, accented speech, and/or loss of essential voice characteristics. Previous research shows that many models appear to have insufficient generalization capabilities to perform well on each of these cross-lingual aspects. To overcome these generalization problems, we propose to apply the teacher-student paradigm to cross-lingual synthesis. While a teacher model is commonly used to produce teacher-forced data, we propose to also use it to produce augmented data of unseen speaker-language pairs, where the aim is to retain essential speaker characteristics. Both sets of data are then used for student model training, which is trained to retain the naturalness and prosodic variation…
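As a rough illustration of the data split the abstract describes, the sketch below labels each speaker-language pair by the kind of teacher output that would feed the student: teacher-forced data for pairs that have recordings, augmented data for unseen pairs. This is not the authors' implementation. The function name `build_student_pairs`, the speaker and language identifiers, and the labels `teacher_forced` and `augmented` are all hypothetical, and the actual synthesis and training steps are omitted.

```python
from itertools import product


def build_student_pairs(speakers, languages, recorded_pairs):
    """Label every speaker-language pair by the teacher output it would use.

    Hypothetical helper: pairs with recordings are marked "teacher_forced",
    all remaining (unseen) pairs are marked "augmented".
    """
    recorded = set(recorded_pairs)
    plan = []
    for speaker, language in product(speakers, languages):
        if (speaker, language) in recorded:
            source = "teacher_forced"  # teacher run on existing recordings
        else:
            source = "augmented"  # teacher synthesizes an unseen pair
        plan.append((speaker, language, source))
    return plan


# Toy setup (assumed): two speakers, each recorded in one language only.
speakers = ["spk_a", "spk_b"]
languages = ["en", "nl"]
recorded_pairs = [("spk_a", "en"), ("spk_b", "nl")]

plan = build_student_pairs(speakers, languages, recorded_pairs)
for row in plan:
    print(row)
```

In this toy setup the student's training plan covers all four speaker-language combinations, two of which exist only as teacher-generated augmented data.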
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and dialogue systems · Natural Language Processing Techniques
