Bengali to Assamese Statistical Machine Translation using Moses (Corpus Based)

Nayan Jyoti Kalita; Baharul Islam

arXiv:1504.01182·cs.CL·April 7, 2015

TL;DR

This paper develops a Bengali to Assamese statistical machine translation system using Moses, leveraging a parallel corpus and existing tools, aiming to improve machine translation for Indian languages.

Contribution

It presents a novel Bengali to Assamese SMT model using Moses with a specific corpus and tools, addressing a gap in Indian language machine translation research.

Findings

01 Created a Bengali-Assamese translation model using Moses

02 Used a 17,100-sentence parallel corpus for training

03 Addressed the lack of statistical MT research for Indian languages

Abstract

Machine translation plays an important role in facilitating human-machine communication, as well as human-human communication, in Natural Language Processing (NLP). Machine Translation (MT) refers to using a machine to convert text from one language to another. Statistical Machine Translation (SMT) is a type of MT consisting of a Language Model (LM), a Translation Model (TM), and a decoder. In this paper, a Bengali to Assamese Statistical Machine Translation model has been created using Moses. Other translation tools, IRSTLM for the Language Model and GIZA-PP-V1.0.7 for the Translation Model, are used within this framework, which runs in Linux environments. The purpose of the LM is to encourage fluent output, and the purpose of the TM is to encourage similarity between input and output; the decoder maximizes the probability of the translated text in the target language. A parallel corpus of 17100…
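The division of labor between LM, TM, and decoder described in the abstract can be illustrated with a minimal noisy-channel sketch. This is not the paper's system and does not call Moses, IRSTLM, or GIZA: the token names (`s1`, `t1`, ...), the probability tables, and the helper functions are all hypothetical, and the exhaustive search stands in for the beam search over a weighted log-linear model that a real decoder such as Moses performs.

```python
import math
from itertools import product

# Hypothetical translation model: P(source token | target token).
# All tokens and probabilities are invented for illustration.
TM = {
    ("s1", "t1"): 0.7,
    ("s1", "t2"): 0.3,
    ("s2", "t3"): 0.6,
    ("s2", "t4"): 0.4,
}

# Hypothetical bigram language model over target tokens: P(word | previous word).
LM = {
    ("<s>", "t1"): 0.5,
    ("<s>", "t2"): 0.5,
    ("t1", "t3"): 0.1,
    ("t1", "t4"): 0.9,
    ("t2", "t3"): 0.5,
    ("t2", "t4"): 0.5,
}

UNSEEN = 1e-6  # crude floor for pairs missing from the toy tables


def tm_logprob(source, target):
    """Adequacy: how well each target token explains its source token."""
    return sum(math.log(TM.get((s, t), UNSEEN)) for s, t in zip(source, target))


def lm_logprob(target):
    """Fluency: bigram log-probability of the target sequence."""
    total, prev = 0.0, "<s>"
    for word in target:
        total += math.log(LM.get((prev, word), UNSEEN))
        prev = word
    return total


def decode(source, use_lm=True):
    """Pick the candidate maximizing log P(f|e) + log P(e) by brute force.

    Assumes a monotone, one-to-one word mapping and that every source
    token has at least one entry in TM.
    """
    options = [[t for (s, t) in TM if s == src] for src in source]

    def score(candidate):
        lm_part = lm_logprob(candidate) if use_lm else 0.0
        return tm_logprob(source, candidate) + lm_part

    return list(max(product(*options), key=score))


if __name__ == "__main__":
    print(decode(["s1", "s2"], use_lm=False))  # TM alone
    print(decode(["s1", "s2"]))                # TM combined with LM
```

With these invented numbers, the TM on its own prefers `t1 t3`, while adding the LM shifts the choice to `t1 t4`, because the toy LM rates that sequence as far more fluent. That is the trade-off the abstract attributes to the two models.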
