NLIP_Lab-IITH Multilingual MT System for WAT24 MT Shared Task
Maharaj Brahma, Pramit Sahoo, Maunendra Sankar Desarkar

TL;DR
This paper presents a multilingual machine translation system for 22 Indic languages, utilizing pre-training, bilingual dictionaries, and fine-tuning, achieving competitive results on multiple benchmarks.
Contribution
Introduces a 243M-parameter multilingual translation model for 22 Indic languages, built with alignment-agreement pre-training, bilingual-dictionary word substitution, and direction-specific fine-tuning.
Findings
Achieved average scores on the IN22-Gen benchmark of 46.80 chrF++ and 18.19 BLEU (En-Indic) and 56.34 chrF++ and 30.82 BLEU (Indic-En).
The model is competitive with the larger IndicTransv1 system.
Effective use of bilingual dictionaries and fine-tuning improves translation quality.
Abstract
This paper describes NLIP Lab's multilingual machine translation system for the WAT24 shared task on multilingual Indic MT, covering 22 scheduled languages belonging to 4 language families. We explore pre-training for Indic languages using alignment agreement objectives. We utilize bilingual dictionaries to substitute words from source sentences. Furthermore, we fine-tune language direction-specific multilingual translation models using small, high-quality seed data. Our primary submission is a 243M-parameter multilingual translation model covering 22 Indic languages. On the IN22-Gen benchmark, we achieved an average chrF++ score of 46.80 and a BLEU score of 18.19 for the En-Indic direction. In the Indic-En direction, we achieved an average chrF++ score of 56.34 and a BLEU score of 30.82. On the IN22-Conv benchmark, we achieved an average chrF++ score of 43.43 and a BLEU score of 16.58 in the…
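The dictionary-based substitution mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `substitute_with_dictionary`, the `rate` parameter, whitespace tokenization, and the toy English-Hindi dictionary are all assumptions made for illustration, since the abstract does not say how words are selected or how many are replaced.

```python
import random


def substitute_with_dictionary(sentence, bilingual_dict, rate=0.3, seed=None):
    """Replace some source tokens with a dictionary translation.

    Hypothetical helper: `rate` is the assumed probability of replacing a
    token that has a dictionary entry; tokens without an entry are kept.
    """
    rng = random.Random(seed)
    output = []
    for token in sentence.split():  # assumption: simple whitespace tokenization
        translations = bilingual_dict.get(token.lower())
        if translations and rng.random() < rate:
            # Pick one of the listed translations at random.
            output.append(rng.choice(translations))
        else:
            output.append(token)
    return " ".join(output)


# Toy dictionary, invented for this example only.
toy_dict = {"water": ["पानी", "जल"], "house": ["घर"]}

mixed = substitute_with_dictionary("the house has water", toy_dict, rate=1.0, seed=0)
print(mixed)
```

With `rate=1.0` every token that has a dictionary entry is replaced, producing a code-mixed source sentence; with `rate=0.0` the sentence is returned unchanged.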
Taxonomy
Topics: Neural Networks and Applications · Robotics and Automated Systems · Fuzzy Logic and Control Systems
