Assessing the Ability of Neural TTS Systems to Model Consonant-Induced F0 Perturbation
Tianle Yang, Chengzhe Sun, Phil Rose, Cassandra L. Jacobs, Siwei Lyu

TL;DR
This paper introduces a prosodic probing framework to evaluate neural TTS systems' ability to model consonant-induced F0 perturbation, revealing their reliance on lexical memorization over abstract prosodic encoding.
Contribution
The study presents a novel segmental-level diagnostic tool for assessing prosodic modeling in neural TTS systems and compares their generalization capabilities across different lexical frequencies.
Findings
High accuracy for high-frequency words
Poor generalization to low-frequency words
TTS systems rely more on memorization than prosodic abstraction
Abstract
This study proposes a segmental-level prosodic probing framework to evaluate neural TTS models' ability to reproduce consonant-induced f0 perturbation, a fine-grained segmental-prosodic effect that reflects local articulatory mechanisms. We compare synthetic and natural speech realizations for thousands of words, stratified by lexical frequency, using Tacotron 2 and FastSpeech 2 trained on the same speech corpus (LJ Speech). These controlled analyses are then complemented by a large-scale evaluation spanning multiple advanced TTS systems. Results show accurate reproduction for high-frequency words but poor generalization to low-frequency items, suggesting that the examined TTS architectures rely more on lexical-level memorization than on abstract segmental-prosodic encoding. This finding highlights a limitation in such TTS systems' ability to generalize prosodic detail beyond seen data.…
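The comparison described in the abstract can be illustrated with a small sketch. This summary does not give the paper's measurement procedure, so everything below is an assumption made for illustration: the token layout, the choice of mid-vowel F0 as the reference, the semitone scale, the function names, and the toy numbers. The idea is to express F0 at vowel onset relative to a reference point, then take the mean difference between tokens with voiceless and voiced onsets within each lexical frequency band, once for natural speech and once for synthetic speech.

```python
import math
from collections import defaultdict
from statistics import mean


def semitones(f0_hz, ref_hz):
    """Express f0_hz relative to ref_hz in semitones."""
    return 12.0 * math.log2(f0_hz / ref_hz)


def perturbation_by_band(tokens):
    """Mean onset-F0 difference (voiceless minus voiced) per frequency band.

    tokens: iterable of (freq_band, onset_voicing, f0_onset_hz, f0_mid_hz).
    This layout is hypothetical, not the paper's data format.
    """
    groups = defaultdict(list)
    for band, voicing, f0_onset, f0_mid in tokens:
        groups[(band, voicing)].append(semitones(f0_onset, f0_mid))

    result = {}
    for band in {b for b, _ in groups}:
        voiceless = groups.get((band, "voiceless"))
        voiced = groups.get((band, "voiced"))
        if voiceless and voiced:  # need both categories to form a difference
            result[band] = mean(voiceless) - mean(voiced)
    return result


# Toy values invented for illustration only; they are not results from the paper.
natural_tokens = [
    ("high", "voiceless", 220.0, 200.0),
    ("high", "voiced", 200.0, 200.0),
    ("low", "voiceless", 220.0, 200.0),
    ("low", "voiced", 200.0, 200.0),
]
synthetic_tokens = [
    ("high", "voiceless", 220.0, 200.0),
    ("high", "voiced", 200.0, 200.0),
    ("low", "voiceless", 202.0, 200.0),  # flattened effect on a rare word
    ("low", "voiced", 200.0, 200.0),
]

natural = perturbation_by_band(natural_tokens)
synthetic = perturbation_by_band(synthetic_tokens)

for band in sorted(natural):
    gap = natural[band] - synthetic[band]
    print(f"{band}: natural={natural[band]:.2f} st, "
          f"synthetic={synthetic[band]:.2f} st, gap={gap:.2f} st")
```

Under these assumptions, a gap near zero in a band would mean the synthetic speech reproduces the natural perturbation there, while a large gap in the low-frequency band would correspond to the poor generalization the abstract reports.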
Taxonomy
Topics: Phonetics and Phonology Research · Neurobiology of Language and Bilingualism · Stuttering Research and Treatment
