Assamese-English Bilingual Machine Translation
Kalyanee Kanchan Baruah, Pranjal Das, Abdul Hannan, Shikhar Kr. Sarma

TL;DR
This paper presents a statistical machine translation system for Assamese-English built with Moses, GIZA, and IRSTLM. It discusses the difficulty that morphological richness poses for translation and uses transliteration to handle out-of-vocabulary words.
Contribution
It introduces a bilingual translation system for Assamese-English and incorporates a transliteration component to handle OOV words, improving translation quality.
Findings
BLEU scores differ by direction and are low for English-to-Assamese, which the authors attribute to the morphological richness of Indian languages.
Transliteration helps manage proper nouns and OOV words effectively.
The system demonstrates the feasibility of statistical translation for low-resource Indian languages.
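For readers unfamiliar with the metric behind these findings, the following is a minimal sketch of sentence-level BLEU against a single reference. It is an illustration only and not the authors' evaluation setup: the function names `ngrams` and `bleu`, whitespace tokenization, the absence of smoothing, and the example sentences are all assumptions made here.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU with one reference and no smoothing (illustrative)."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or clipped == 0:
            # Without smoothing, any zero precision makes the score zero.
            return 0.0
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the n-gram precisions, scaled by the brevity penalty.
    return bp * math.exp(sum(log_precisions) / max_n)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    print(bleu("the cat sat on the mat", reference))
    print(bleu("the cat sat on a mat", reference))
```

Because the score depends on exact n-gram matches, a morphologically rich target language tends to score lower: a translation with the right stem but a different inflection counts as a miss at every n-gram order that contains it. Corpus-level BLEU, which is what translation systems normally report, aggregates the clipped counts over all sentences before taking the precisions.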
Abstract
Machine translation is the process of translating text from one language to another. In this paper, statistical machine translation is performed between Assamese and English using a parallel corpus of the two languages. Moses, a statistical phrase-based translation toolkit, is used. Two other tools, IRSTLM and GIZA, are used to build the language model and to align the words, respectively. BLEU score is used to evaluate the performance of the translation system. The BLEU scores differ between translating from Assamese to English and vice versa. Because Indian languages are morphologically very rich, translation from English to Assamese is relatively harder, resulting in a low BLEU score. A statistical transliteration system is also introduced alongside the translation system, mainly to deal with proper nouns, OOV (out-of-vocabulary) words which are not present…
Taxonomy
Topics: Natural Language Processing Techniques · Translation Studies and Practices
