Grapheme-to-Phoneme Transformer Model for Transfer Learning Dialects
Eric Engelhart, Mahsa Elyasi, Gaurav Bharaj

TL;DR
This paper introduces a transformer-based G2P model that adapts to English dialects with limited data and small pronunciation dictionaries, reducing phoneme error rate on dialects such as Indian and British English.
Contribution
It presents a novel transformer-based G2P approach that transfers to new dialects using minimal data and small pronunciation dictionaries.
Findings
The pretrained model achieves a phoneme error rate (PER) of 2.469% on the British dialect.
A model trained from scratch achieves a PER of 26.877% with limited data.
The method shows potential for accent transfer in text-to-speech (TTS) systems.
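The findings above are reported as phoneme error rate. This summary does not state the paper's exact formula, so the sketch below assumes the common definition: total Levenshtein edit distance between predicted and reference phoneme sequences, divided by the total number of reference phonemes, as a percentage. The function names and the ARPAbet-style example pronunciations are illustrative assumptions, not taken from the paper.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phoneme_error_rate(refs: Sequence[Sequence[str]],
                       hyps: Sequence[Sequence[str]]) -> float:
    """PER in percent over a set of words (assumed definition, see lead-in)."""
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must have the same number of entries")
    total_phonemes = sum(len(r) for r in refs)
    if total_phonemes == 0:
        raise ValueError("reference set contains no phonemes")
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return 100.0 * total_edits / total_phonemes


# Hypothetical example: one substituted vowel across 10 reference phonemes.
refs = [["T", "AH", "M", "EY", "T", "OW"], ["W", "AO", "T", "ER"]]
hyps = [["T", "AH", "M", "AA", "T", "OW"], ["W", "AO", "T", "ER"]]
per = phoneme_error_rate(refs, hyps)
```

Under this definition, a PER of 2.469% corresponds to roughly one phoneme-level edit per 40 reference phonemes.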
Abstract
Grapheme-to-Phoneme (G2P) models convert words to their phonetic pronunciations. Classic G2P methods include rule-based systems and pronunciation dictionaries, while modern G2P systems incorporate learning, such as LSTM and Transformer-based attention models. Dictionary-based methods usually require significant manual effort to build and adapt poorly to unseen words. Transformer-based models, in turn, require significant training data and do not generalize well, especially for dialects with limited data. We propose a novel use of a transformer-based attention model that can adapt to unseen dialects of the English language while using a small dictionary. We show that our method has potential applications for accent transfer in text-to-speech, and for building robust G2P models for dialects with limited pronunciation dictionary size. We experiment with two English dialects:…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory
