Multilingual Neural Machine Translation System for Indic to Indic Languages
Sudhansu Bala Das, Divyajyoti Panda, Tapas Kumar Mishra, Bidyut Kr. Patra, Asif Ekbal

TL;DR
This study develops and analyzes multilingual neural machine translation models for Indic languages, exploring the effects of language relatedness, transliteration, and pivoting through English, achieving improved translation quality on multiple language pairs.
Contribution
The paper presents a comprehensive Indic-to-Indic MNMT baseline, investigates relatedness and transliteration effects, and demonstrates the benefits of pivot models and transliteration for low-resource language translation.
Findings
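Training with related languages improves MNMT performance for the West Indo-Aryan (WI) group only; it is detrimental for the East Indo-Aryan (EI) group and inconclusive for the Dravidian (DR) group.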
Pivot models significantly enhance translation BLEU scores.
Transliteration generally boosts model accuracy across languages.
Abstract
This paper gives an Indic-to-Indic (IL-IL) MNMT baseline model for 11 ILs implemented on the Samanantar corpus and analyzed on the Flores-200 corpus. All the models are evaluated using the BLEU score. In addition, the languages are classified under three groups, namely East Indo-Aryan (EI), Dravidian (DR), and West Indo-Aryan (WI). The effect of language relatedness on MNMT model efficiency is studied. Owing to the presence of large corpora from English (EN) to ILs, MNMT IL-IL models using EN as a pivot are also built and examined. To achieve this, English-Indic (EN-IL) models are also developed, with and without the usage of related languages. Results reveal that using related languages is beneficial for the WI group only, while it is detrimental for the EI group and shows an inconclusive effect on the DR group, but it is useful for EN-IL models. Thus, related language groups are used…
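The pivoting idea described in the abstract (translate the source IL into EN, then EN into the target IL) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `make_pivot_translator`, `toy_src_to_en`, and `toy_en_to_tgt` are hypothetical, and the two toy lookup tables merely stand in for trained IL-EN and EN-IL models.

```python
from typing import Callable

# A translator is modeled as any function from a source string to a target string.
Translator = Callable[[str], str]


def make_pivot_translator(src_to_pivot: Translator,
                          pivot_to_tgt: Translator) -> Translator:
    """Compose two translators so that text goes source -> pivot -> target."""
    def translate(text: str) -> str:
        pivot_text = src_to_pivot(text)   # e.g. an IL-EN model
        return pivot_to_tgt(pivot_text)   # e.g. an EN-IL model
    return translate


# Hypothetical toy stand-ins for trained models (assumption, for illustration only).
def toy_src_to_en(text: str) -> str:
    return {"namaste": "hello"}.get(text, text)


def toy_en_to_tgt(text: str) -> str:
    return {"hello": "nomoshkar"}.get(text, text)


src_to_tgt = make_pivot_translator(toy_src_to_en, toy_en_to_tgt)
print(src_to_tgt("namaste"))
```

Because the two stages are chained, any error made in the first stage is passed on to the second, which is one reason pivot systems are usually compared against direct IL-IL models on the same test set.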
Taxonomy
Topics: Natural Language Processing Techniques
