Sideways Transliteration: How to Transliterate Multicultural Person Names?
Raphael Cohen, Michael Elhadad

TL;DR
This paper introduces a method for transliterating person names of diverse cultural origins using only monolingual resources: candidates are produced by noisy transliteration and then ranked with origin-specific letter models. The method is demonstrated on English-to-Hebrew transliteration.
Contribution
It presents an unsupervised approach to transliteration that learns origin-specific models from monolingual data mined from sites such as Facebook and Wikipedia, without requiring parallel corpora.
Findings
Effective transliteration from English to Hebrew demonstrated.
Unsupervised learning of origin-specific models from social media data.
Online web service for English-Hebrew transliteration provided.
Abstract
In a global setting, texts contain transliterated names from many cultural origins. Correct transliteration depends not only on the target and source languages but also on the language of origin of the name. We introduce a novel methodology for transliteration of names originating in different languages using only monolingual resources. Our method is based on a step of noisy transliteration followed by ranking of the results based on origin-specific letter models. The transliteration table used for noisy generation is learned in an unsupervised manner for each possible origin language. We present a solution for gathering the monolingual training data used by our method by mining social media sites such as Facebook and Wikipedia. We present results in the context of transliterating from English to Hebrew and provide an online web service for transliteration from English to Hebrew.
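The two-step idea in the abstract (noisy generation, then ranking with a letter model) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the hand-written `TABLE`, the four-name training list, and the add-one-smoothed character bigram model are all assumptions made for illustration. In the paper the table is learned without supervision, and one letter model would be trained per origin language.

```python
import math
from collections import Counter
from itertools import product

def generate_candidates(name, table):
    """Noisy generation: expand every source letter into all of its target options."""
    options = [table.get(ch, [ch]) for ch in name.lower()]
    return ["".join(parts) for parts in product(*options)]

def train_bigram_model(names):
    """Count character bigrams (with boundary markers) over a monolingual name list."""
    bigrams, contexts, alphabet = Counter(), Counter(), set()
    for n in names:
        padded = "^" + n + "$"
        alphabet.update(padded)
        for a, b in zip(padded, padded[1:]):
            bigrams[(a, b)] += 1
            contexts[a] += 1
    return bigrams, contexts, len(alphabet)

def log_prob(word, model):
    """Add-one-smoothed bigram log-probability of a candidate spelling."""
    bigrams, contexts, vocab_size = model
    padded = "^" + word + "$"
    return sum(
        math.log((bigrams[(a, b)] + 1) / (contexts[a] + vocab_size))
        for a, b in zip(padded, padded[1:])
    )

def transliterate(name, table, model, top_k=3):
    """Generate all noisy candidates and return the top_k by letter-model score."""
    candidates = generate_candidates(name, table)
    return sorted(candidates, key=lambda c: log_prob(c, model), reverse=True)[:top_k]

# Hypothetical toy data: a hand-written English-to-Hebrew letter table and a
# tiny monolingual list of Hebrew names standing in for one origin's corpus.
TABLE = {"d": ["ד"], "a": ["", "א"], "n": ["נ", "ן"]}
MODEL = train_bigram_model(["דן", "רן", "חן", "דנה"])

best = transliterate("dan", TABLE, MODEL)[0]
print(best)
```

With several origins, one such model per origin could be kept in a dictionary and the candidate list scored under the model for the name's origin. The raw log-probability used here favors shorter candidates, so a real system would likely need length normalization or a stronger letter model.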
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Wikis in Education and Collaboration
