Machine Translation in Indian Languages: Challenges and Resolution
Raj Nath Patel, Prakash B. Pimpale, M Sasikumar

TL;DR
This paper addresses structural and morphological challenges in English to Indian language machine translation by employing pre-ordering and suffix separation techniques, leading to improved translation quality.
Contribution
It introduces a novel combination of pre-ordering and suffix separation methods to enhance statistical machine translation for Indian languages.
Findings
Pre-ordering improves syntactic alignment.
Suffix separation reduces morphological divergence.
Together, the two techniques improve the quality of English to Indian language translation.
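As a rough illustration of the pre-ordering idea, the sketch below moves the verb group of a POS-tagged English clause to the end, so that subject-verb-object order becomes the subject-object-verb order typical of Indian languages. This is a toy rule under stated assumptions, not the paper's rule set: the function name `preorder_svo_to_sov`, the Penn Treebank style tags, and the single "move the first verb group" rule are all chosen here for illustration.

```python
def preorder_svo_to_sov(tagged):
    """Move the first contiguous verb group to the end of the clause.

    `tagged` is a list of (token, tag) pairs with Penn Treebank style tags.
    Hypothetical toy rule for illustration only.
    """
    def is_verbal(tag):
        return tag.startswith("VB") or tag == "MD"

    # Locate the start of the first verb group, if any.
    start = next((i for i, (_, tag) in enumerate(tagged) if is_verbal(tag)), None)
    if start is None:
        return [tok for tok, _ in tagged]

    # Extend the group over adjacent modal/verb tokens.
    end = start
    while end < len(tagged) and is_verbal(tagged[end][1]):
        end += 1

    before, verbs, after = tagged[:start], tagged[start:end], tagged[end:]

    # Keep sentence-final punctuation in last position.
    tail = []
    if after and after[-1][1] == ".":
        tail = [after[-1]]
        after = after[:-1]

    return [tok for tok, _ in before + after + verbs + tail]


sentence = [("Ram", "NNP"), ("eats", "VBZ"), ("an", "DT"), ("apple", "NN"), (".", ".")]
print(" ".join(preorder_svo_to_sov(sentence)))  # Ram an apple eats .
```

The same reordering would be applied to source sentences both before training and before translation, so the model only ever sees English in the restructured order.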
Abstract
English to Indian language machine translation poses the challenge of structural and morphological divergence. This paper describes English to Indian language statistical machine translation using pre-ordering and suffix separation. The pre-ordering uses rules to restructure the source sentences prior to training and translation. This syntactic restructuring helps statistical machine translation tackle the structural divergence and hence yields better translation quality. The suffix separation is used to tackle the morphological divergence between English and highly agglutinative Indian languages. We demonstrate that the use of pre-ordering and suffix separation helps improve the quality of English to Indian language machine translation.
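Suffix separation can be pictured as splitting an inflected target-language word into a stem and a marked suffix token, so that the stem and the suffix each align with an English word more easily. The sketch below is a minimal version under stated assumptions: the suffix list, the romanized example words, the `+` marker convention, and the function names are hypothetical and are not taken from the paper.

```python
# Hypothetical romanized suffixes, for illustration only.
SUFFIXES = ["madhye", "at", "la"]


def separate_suffix(word, suffixes=SUFFIXES, min_stem=2, marker="+"):
    """Split off the longest matching suffix, keeping a stem of at least min_stem characters."""
    for suf in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suf) and len(word) - len(suf) >= min_stem:
            return [word[:-len(suf)], marker + suf]
    return [word]


def separate_sentence(sentence):
    """Apply suffix separation to every whitespace-delimited token."""
    out = []
    for word in sentence.split():
        out.extend(separate_suffix(word))
    return " ".join(out)


def rejoin(sentence, marker="+"):
    """Undo the separation by attaching marked suffixes to the preceding token."""
    out = []
    for tok in sentence.split():
        if tok.startswith(marker) and out:
            out[-1] += tok[len(marker):]
        else:
            out.append(tok)
    return " ".join(out)


split = separate_sentence("to gharat ahe")
print(split)          # to ghar +at ahe
print(rejoin(split))  # to gharat ahe
```

In a pipeline of this shape, the target side of the training data would be split before training, and the translation output would be passed through `rejoin` to restore full word forms.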
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Algorithms and Data Compression
