Machine Translation in Indian Languages: Challenges and Resolution

Raj Nath Patel; Prakash B. Pimpale; M Sasikumar

arXiv:1708.07950 · cs.CL · August 3, 2018 · 2 cites

Open Access

TL;DR

This paper addresses structural and morphological challenges in English to Indian language machine translation by employing pre-ordering and suffix separation techniques, leading to improved translation quality.

Contribution

It introduces a novel combination of pre-ordering and suffix separation methods to enhance statistical machine translation for Indian languages.

Findings

01

Pre-ordering improves syntactic alignment.

02

Suffix separation reduces morphological divergence.

03

Using both techniques together improves English to Indian language translation quality.

Abstract

English to Indian language machine translation poses the challenge of structural and morphological divergence. This paper describes English to Indian language statistical machine translation using pre-ordering and suffix separation. The pre-ordering uses rules to transfer the structure of the source sentences prior to training and translation. This syntactic restructuring helps statistical machine translation tackle the structural divergence and hence yields better translation quality. The suffix separation is used to tackle the morphological divergence between English and highly agglutinative Indian languages. We demonstrate that the use of pre-ordering and suffix separation helps in improving the quality of English to Indian language machine translation.
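The two techniques named in the abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the single verb-final reordering rule, the chunk labels, the romanized suffix list, and the function names `preorder_svo_to_sov` and `separate_suffix` are hypothetical and are not the rules or suffix inventory used in the paper.

```python
# Toy sketch only: the rule and the suffix list below are hypothetical,
# not the pre-ordering rules or suffix inventory from the paper.

def preorder_svo_to_sov(chunks):
    """Move verb chunks to the end of a flat clause.

    `chunks` is a list of (label, text) pairs, where the label "V"
    marks a verb chunk. This mimics one kind of source-side
    restructuring: English is verb-medial, while many Indian
    languages are verb-final.
    """
    verbs = [c for c in chunks if c[0] == "V"]
    others = [c for c in chunks if c[0] != "V"]
    return others + verbs


# Illustrative romanized suffixes (assumed, not taken from the paper).
HYPOTHETICAL_SUFFIXES = ["on", "en", "e"]


def separate_suffix(word, suffixes=HYPOTHETICAL_SUFFIXES, min_stem=2):
    """Split the longest matching suffix off a word.

    Returns [stem, "+suffix"] when a suffix matches and the remaining
    stem has at least `min_stem` characters, otherwise [word].
    """
    for suffix in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return [word[: -len(suffix)], "+" + suffix]
    return [word]


clause = [("S", "Ram"), ("V", "eats"), ("O", "mangoes")]
reordered = preorder_svo_to_sov(clause)
split_word = separate_suffix("ladkon")
unsplit_word = separate_suffix("ghar")
```

In a pipeline of this shape, the reordering would be applied to the English side before both training and translation, and the suffix split to the Indian language side, so that the alignment model sees closer word order and fewer distinct surface forms.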

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Algorithms and Data Compression