A Comparison of Different Machine Transliteration Models

K. Choi; H. Isahara; J. Oh

arXiv:1110.1391 · cs.CL · October 10, 2011

TL;DR

This paper compares four different machine transliteration models within a unified framework, revealing their relative effectiveness and how they can complement each other to enhance transliteration accuracy.

Contribution

It introduces a unified framework for comparing multiple transliteration models and demonstrates their complementary strengths for improved performance.

Findings

01 The hybrid and correspondence-based models are the most effective of the four.

02 All four models can be combined for better results.

03 A unified framework for comparing the models was successfully developed.
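To illustrate what combining the outputs of several transliteration models could look like, here is a minimal sketch. It is not the authors' method, which this summary does not describe. It assumes each model is a function that returns scored candidate transliterations, and it merges them by summing each model's normalized scores. The names `combine`, `toy_grapheme_model`, and `toy_phoneme_model`, as well as the candidate strings and scores, are hypothetical placeholders.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# Assumption: a "model" is any function mapping a source word to a dict of
# {candidate transliteration: non-negative score}. This interface is
# hypothetical and is not taken from the paper.
Model = Callable[[str], Dict[str, float]]


def combine(models: List[Model], word: str) -> List[Tuple[str, float]]:
    """Rank candidates by the sum of per-model normalized scores."""
    totals: Dict[str, float] = defaultdict(float)
    for model in models:
        scores = model(word)
        norm = sum(scores.values())
        if norm <= 0:
            continue  # skip a model that produced no usable candidates
        for candidate, score in scores.items():
            totals[candidate] += score / norm
    # Highest combined score first; ties broken alphabetically for determinism.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy stand-ins with made-up placeholder candidates and scores.
def toy_grapheme_model(word: str) -> Dict[str, float]:
    return {"cand_a": 3.0, "cand_b": 1.0}


def toy_phoneme_model(word: str) -> Dict[str, float]:
    return {"cand_b": 3.0, "cand_c": 1.0}


ranked = combine([toy_grapheme_model, toy_phoneme_model], "example")
for candidate, score in ranked:
    print(f"{candidate}\t{score:.2f}")
```

A candidate proposed by more than one model accumulates score from each of them, which is one simple way that models with different strengths might complement each other. Other combination schemes, such as weighted voting or a learned re-ranker, would fit the same interface.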

Abstract

Machine transliteration is a method for automatically converting words in one language into phonetically equivalent ones in another language. Machine transliteration plays an important role in natural language applications such as information retrieval and machine translation, especially for handling proper nouns and technical terms. Four machine transliteration models -- grapheme-based transliteration model, phoneme-based transliteration model, hybrid transliteration model, and correspondence-based transliteration model -- have been proposed by several researchers. To date, however, there has been little research on a framework in which multiple transliteration models can operate simultaneously. Furthermore, there has been no comparison of the four models within the same framework and using the same data. We addressed these problems by 1) modeling the four models within the same…
