Accented Text-to-Speech Synthesis with Limited Data
Xuehao Zhou, Mingyang Zhang, Yi Zhou, Zhizheng Wu, Haizhou Li

TL;DR
This paper introduces a limited-data accented TTS framework that models phonetic and prosodic variations separately, enabling effective accent rendering with minimal target accent data.
Contribution
It proposes a two-model accented TTS system with pre-training and fine-tuning, specifically designed for low-resource accent adaptation in speech synthesis.
Findings
Effective phonetic variation handling with a small lexicon
Improved prosodic rendering with limited speech data
Enhanced speech quality and accent similarity
Abstract
This paper presents an accented text-to-speech (TTS) synthesis framework with limited training data. We study two aspects concerning accent rendering: phonetic (phoneme difference) and prosodic (pitch pattern and phoneme duration) variations. The proposed accented TTS framework consists of two models: an accented front-end for grapheme-to-phoneme (G2P) conversion and an accented acoustic model with integrated pitch and duration predictors for phoneme-to-Mel-spectrogram prediction. The accented front-end directly models the phonetic variation, while the accented acoustic model explicitly controls the prosodic variation. Specifically, both models are first pre-trained on a large amount of data, then only the accent-related layers are fine-tuned on a limited amount of data for the target accent. In the experiments, speech data of three English accents, i.e., General American English, Irish…
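The fine-tuning strategy described in the abstract (pre-train everything, then update only the accent-related layers on the small target-accent set) can be illustrated with a minimal sketch. This is not the authors' implementation: the parameter names, the choice of which layers count as "accent-related", the gradient values, and the learning rate are all hypothetical placeholders, and plain floats stand in for weight tensors.

```python
from typing import Dict, Set


def fine_tune_step(
    params: Dict[str, float],
    grads: Dict[str, float],
    trainable: Set[str],
    lr: float = 0.5,
) -> Dict[str, float]:
    """Apply one SGD step, updating only parameters named in `trainable`.

    Parameters outside `trainable` keep their pre-trained values (frozen).
    """
    return {
        name: value - lr * grads[name] if name in trainable else value
        for name, value in params.items()
    }


# Hypothetical pre-trained weights (scalars stand in for tensors).
pretrained = {
    "encoder.weight": 1.0,
    "decoder.weight": 2.0,
    "pitch_predictor.weight": 0.5,
    "duration_predictor.weight": 0.25,
}

# Assumption for illustration only: the pitch and duration predictors
# are treated as the accent-related layers to adapt.
accent_related = {"pitch_predictor.weight", "duration_predictor.weight"}

# Hypothetical gradients computed on the limited target-accent data.
grads = {name: 0.5 for name in pretrained}

adapted = fine_tune_step(pretrained, grads, accent_related)
```

In this sketch the encoder and decoder weights are returned unchanged while the two predictor weights move, which is the point of selective fine-tuning: the small accent dataset only has to adjust a small subset of parameters.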
Taxonomy
Topics: Speech Recognition and Synthesis · Phonetics and Phonology Research · Natural Language Processing Techniques
