Sequence-to-sequence neural network models for transliteration

Mihaela Rosca; Thomas Breuel

arXiv:1610.09565·cs.CL·November 1, 2016·57 cites

TL;DR

This paper shows that neural sequence-to-sequence models achieve near state-of-the-art results in transliteration tasks and provides an open-source Arabic-English dataset and models to facilitate further research.

Contribution

It introduces a new Arabic-English transliteration dataset and trained models, making machine transliteration more accessible.

Findings

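01. Neural sequence-to-sequence models achieve state-of-the-art or near state-of-the-art results on existing transliteration datasets

02. A new Arabic-to-English transliteration dataset and the trained models are open-sourced

03. Where the models fall short of the state of the art, their results remain close to it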

Abstract

Transliteration is a key component of machine translation systems and software internationalization. This paper demonstrates that neural sequence-to-sequence models obtain state-of-the-art or close to state-of-the-art results on existing datasets. In an effort to make machine transliteration accessible, we open-source a new Arabic-to-English transliteration dataset and our trained models.
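
To illustrate how transliteration can be posed as a sequence-to-sequence problem, the sketch below encodes word pairs as character-id sequences framed by start and end markers. It assumes a character-level representation, which the abstract does not specify. The word pairs, special tokens, and helper names (`build_vocab`, `encode`, `decode`) are hypothetical and are not taken from the paper or its released dataset.

```python
# Toy illustration only: the pairs, token names, and helper functions are
# hypothetical and are not taken from the paper or its released dataset.
PAD, SOS, EOS = "<pad>", "<sos>", "<eos>"


def build_vocab(strings):
    """Map the special tokens and every character seen in `strings` to an id."""
    chars = sorted({ch for s in strings for ch in s})
    return {sym: i for i, sym in enumerate([PAD, SOS, EOS] + chars)}


def encode(text, vocab):
    """Turn a word into a list of ids framed by start and end markers."""
    return [vocab[SOS]] + [vocab[ch] for ch in text] + [vocab[EOS]]


def decode(ids, vocab):
    """Invert `encode`, dropping the special tokens."""
    inverse = {i: sym for sym, i in vocab.items()}
    return "".join(inverse[i] for i in ids if inverse[i] not in (PAD, SOS, EOS))


# Made-up Arabic-to-English name pairs used purely as sample input.
pairs = [("محمد", "mohamed"), ("سارة", "sara")]

# Source and target scripts get separate vocabularies.
src_vocab = build_vocab(src for src, _ in pairs)
tgt_vocab = build_vocab(tgt for _, tgt in pairs)

# Each training example is a (source ids, target ids) pair.
encoded = [(encode(s, src_vocab), encode(t, tgt_vocab)) for s, t in pairs]
```

An encoder would consume the source ids and a decoder would emit target ids one character at a time until it produces the end marker. Source and target lengths generally differ, which is why a sequence-to-sequence formulation fits the task.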

Code & Models

Repositories

googlei18n/transliteration (official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Handwritten Text Recognition Techniques