Massively Multilingual Neural Grapheme-to-Phoneme Conversion

Ben Peters; Jon Dehdari; Josef van Genabith

arXiv:1708.01464 · cs.CL · October 5, 2017 · 1 cites

TL;DR

This paper introduces a neural multilingual g2p system trained on spelling–pronunciation pairs from hundreds of languages. A single encoder and decoder are shared across all languages, which improves accuracy for low-resource languages and yields a more compact model than previous approaches.

Contribution

A novel neural sequence-to-sequence multilingual g2p model that leverages shared representations across many languages, enabling better low-resource language conversion.

Findings

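01. 11% improvement in phoneme error rate over an approach that adapts high-resource monolingual g2p models to low-resource languages

02. Sharing a single encoder and decoder across all languages improves cross-lingual transfer

03. Model is much more compact than previous approaches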

Abstract

Grapheme-to-phoneme conversion (g2p) is necessary for text-to-speech and automatic speech recognition systems. Most g2p systems are monolingual: they require language-specific data or handcrafting of rules. Such systems are difficult to extend to low-resource languages, for which data and handcrafted rules are not available. As an alternative, we present a neural sequence-to-sequence approach to g2p which is trained on spelling–pronunciation pairs in hundreds of languages. The system shares a single encoder and decoder across all languages, allowing it to utilize the intrinsic similarities between different writing systems. We show an 11% improvement in phoneme error rate over an approach based on adapting high-resource monolingual g2p models to low-resource languages. Our model is also much more compact relative to previous approaches.
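
The abstract reports results in phoneme error rate (PER) but does not define it. The sketch below assumes the common definition: the total Levenshtein edit distance between predicted and reference phoneme sequences, divided by the total number of reference phonemes. The function names and the toy pronunciations are illustrative assumptions, and the paper's exact scoring procedure may differ.

```python
from typing import List, Sequence, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences (unit costs)."""
    # prev[j] holds the distance between the first i-1 reference phonemes
    # and the first j hypothesis phonemes.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phoneme_error_rate(pairs: List[Tuple[Sequence[str], Sequence[str]]]) -> float:
    """Corpus-level PER over (reference, hypothesis) pairs.

    Assumed definition: summed edit distance / summed reference length.
    """
    total_edits = sum(edit_distance(ref, hyp) for ref, hyp in pairs)
    total_ref = sum(len(ref) for ref, _ in pairs)
    if total_ref == 0:
        raise ValueError("reference sequences contain no phonemes")
    return total_edits / total_ref


# Toy data (hypothetical, not from the paper): one substitution in six phonemes.
example_pairs = [
    (["k", "æ", "t"], ["k", "a", "t"]),
    (["d", "ɔ", "g"], ["d", "ɔ", "g"]),
]
example_per = phoneme_error_rate(example_pairs)
```

Normalizing by the summed reference length, rather than averaging per-word rates, keeps long words from being under-weighted; this is a design choice of the sketch, not something the abstract specifies.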

Code & Models

Repositories

bpopeters/mg2p (Official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis