Learning to pronounce as measuring cross-lingual joint orthography-phonology complexity
Domenic Rosati

TL;DR
This paper investigates the complexity of learning pronunciation across languages by modeling grapheme-to-phoneme transliteration with transformer models, revealing factors that influence language difficulty and emphasizing the importance of data balance.
Contribution
It introduces a cross-lingual g2p model to analyze pronunciation complexity and identifies key orthographic features affecting language learnability.
Findings
Languages with more complex g2p mappings are harder to learn.
Orthographic transparency correlates with pronunciation ease.
Data sparsity impacts cross-lingual pronunciation modeling.
Abstract
Machine learning models allow us to compare languages by showing how hard a task in each language might be to learn and perform well on. Following this line of investigation, we explore what makes a language "hard to pronounce" by modelling the task of grapheme-to-phoneme (g2p) transliteration. By training a character-level transformer model on this task across 22 languages and measuring the model's proficiency against its grapheme and phoneme inventories, we show that certain characteristics emerge that separate easier and harder languages with respect to learning to pronounce. Namely, the complexity of a language's pronunciation from its orthography is due to the expressiveness or simplicity of its grapheme-to-phoneme mapping. Further discussion illustrates how future studies should consider relative data sparsity per language to design fairer cross-lingual comparison tasks.
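As a rough illustration of what "measuring the model's proficiency against its grapheme and phoneme inventories" could look like in practice, the sketch below computes inventory sizes from a toy lexicon and a phoneme error rate from predicted pronunciations. The abstract does not state which metric the paper uses, so phoneme error rate (edit distance normalized by reference length) is an assumption here, and the function names, toy lexicon, and predictions are all hypothetical.

```python
from typing import Dict, List, Sequence, Set, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two symbol sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phoneme_error_rate(pairs: List[Tuple[List[str], List[str]]]) -> float:
    """Total edit distance over total reference length (assumed metric)."""
    errors = sum(edit_distance(ref, hyp) for ref, hyp in pairs)
    total = sum(len(ref) for ref, _ in pairs)
    return errors / total if total else 0.0


def inventories(lexicon: Dict[str, List[str]]) -> Tuple[Set[str], Set[str]]:
    """Collect the distinct graphemes and phonemes seen in a lexicon."""
    graphemes = {ch for word in lexicon for ch in word}
    phonemes = {p for phones in lexicon.values() for p in phones}
    return graphemes, phonemes


# Hypothetical toy lexicon: spelling -> reference phoneme sequence.
toy_lexicon = {
    "cat": ["k", "æ", "t"],
    "city": ["s", "ɪ", "t", "i"],
}

# Hypothetical model outputs, one predicted sequence per word.
toy_predictions = {
    "cat": ["k", "a", "t"],
    "city": ["s", "ɪ", "t", "i"],
}

graphemes, phonemes = inventories(toy_lexicon)
pairs = [(toy_lexicon[w], toy_predictions[w]) for w in toy_lexicon]
per = phoneme_error_rate(pairs)

print(f"graphemes: {len(graphemes)}, phonemes: {len(phonemes)}, PER: {per:.3f}")
```

Repeating this per language would give one (grapheme inventory size, phoneme inventory size, error rate) triple per language, which is the kind of comparison the abstract describes. Note that the grapheme "c" maps to two different phonemes in the toy lexicon, a small instance of the one-to-many mappings that make an orthography less transparent.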
Taxonomy
Topics: Natural Language Processing Techniques · Text Readability and Simplification · Topic Modeling
