Domain Adaptation of NMT models for English-Hindi Machine Translation Task at AdapMT ICON 2020
Ramchandra Joshi, Rushabh Karnavat, Kaustubh Jirapure, Raviraj Joshi

TL;DR
This paper explores domain adaptation techniques for English-Hindi neural machine translation, demonstrating effective fine-tuning and mixed-domain training methods that improve translation quality in specific domains.
Contribution
It compares LSTM and Transformer models for domain-specific NMT and applies simple adaptation techniques, achieving top rankings in shared task evaluations.
Findings
Transformer outperforms LSTM in BLEU scores
Fine-tuning improves domain-specific translation quality
Simple adaptation methods are effective for low-resource domains
Abstract
Recent advances in Neural Machine Translation (NMT) models have been shown to produce state-of-the-art results on machine translation for low-resource Indian languages. This paper describes the neural machine translation systems for the English-Hindi language pair submitted to the AdapMT Shared Task at ICON 2020. The shared task aims to build translation systems for Indian languages in specific domains, such as Artificial Intelligence (AI) and Chemistry, using a small in-domain parallel corpus. We evaluated the effectiveness of two popular NMT architectures, LSTM and Transformer, on the English-Hindi machine translation task based on BLEU scores. We train these models primarily on out-of-domain data and employ simple domain adaptation techniques based on the characteristics of the in-domain dataset. Fine-tuning and mixed-domain data approaches are used for domain adaptation.…
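The mixed-domain data approach named in the abstract can be illustrated with a minimal sketch. The paper's actual data pipeline is not shown here, so the function name `build_mixed_corpus`, the `oversample` knob, and the toy sentence pairs below are all hypothetical; the sketch only assumes that the small in-domain corpus is combined with the larger out-of-domain corpus into a single shuffled training set.

```python
import random


def build_mixed_corpus(out_domain, in_domain, oversample=1, seed=0):
    """Combine out-of-domain and in-domain sentence pairs into one training set.

    out_domain, in_domain: lists of (source, target) pairs.
    oversample: number of copies of the in-domain data to include
                (a hypothetical knob, not a value taken from the paper).
    """
    if oversample < 1:
        raise ValueError("oversample must be >= 1")
    # Repeat the small in-domain set so it is not drowned out by general data.
    mixed = list(out_domain) + list(in_domain) * oversample
    # Shuffle with a fixed seed so the mix is reproducible.
    random.Random(seed).shuffle(mixed)
    return mixed


# Toy data for illustration only.
general = [("hello", "नमस्ते"), ("thank you", "धन्यवाद"), ("good morning", "सुप्रभात")]
chemistry = [("acid", "अम्ल")]

mixed = build_mixed_corpus(general, chemistry, oversample=2)
print(len(mixed))
```

Fine-tuning, by contrast, would typically train on `general` alone first and then continue training the same model on `chemistry`, rather than merging the two sets up front.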
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Softmax · Dropout · Attention Is All You Need · Label Smoothing · Sigmoid Activation · Adam
