Building a Multi-domain Neural Machine Translation Model using Knowledge Distillation
Idriss Mghabbar, Pirashanth Ratnamogan

TL;DR
This paper introduces a scalable training pipeline using knowledge distillation and multiple specialized teachers to improve multi-domain neural machine translation, outperforming traditional finetuning methods.
Contribution
The authors propose a novel training pipeline that leverages knowledge distillation and multiple teachers for efficient multi-domain NMT adaptation without extra inference costs.
Findings
Improves BLEU scores by up to 2 points across 2 to 4 domains.
Finetunes efficiently without adding inference cost.
Outperforms simple mixed-finetuning.
Abstract
Lack of specialized data makes building a multi-domain neural machine translation tool challenging. Although emerging literature dealing with low-resource languages starts to show promising results, most state-of-the-art models use millions of sentences. Today, the majority of multi-domain adaptation techniques are based on complex and sophisticated architectures that are not suited to real-world applications. So far, no scalable method performs better than the simple yet effective mixed-finetuning, i.e., finetuning a generic model with a mix of all specialized data and generic data. In this paper, we propose a new training pipeline where knowledge distillation and multiple specialized teachers allow us to efficiently finetune a model without adding new costs at inference time. Our experiments demonstrated that our training pipeline allows improving the performance of multi-domain…
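To make the idea of distilling from several specialized teachers concrete, here is a minimal sketch of a per-token distillation loss in which each training example is matched to the teacher for its own domain. It is not the authors' implementation: the interpolation weight `alpha`, the function names, the batch layout, and the fallback to plain negative log-likelihood for examples without a teacher (for instance generic data) are all assumptions made for illustration, since the abstract does not give the exact objective.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, target_index, alpha=0.5):
    """Per-token loss: alpha * CE(teacher, student) + (1 - alpha) * NLL(reference).

    alpha is a hypothetical interpolation weight, not a value from the paper.
    """
    student_probs = softmax(student_logits)
    teacher_probs = softmax(teacher_logits)
    # Cross-entropy of the student under the teacher's soft distribution.
    kd = -sum(t * math.log(s) for t, s in zip(teacher_probs, student_probs))
    # Standard negative log-likelihood of the reference token.
    nll = -math.log(student_probs[target_index])
    return alpha * kd + (1.0 - alpha) * nll


def multi_teacher_loss(batch, teachers, alpha=0.5):
    """Average loss over a batch, picking the teacher by each example's domain.

    batch: list of dicts with keys "domain", "source", "student_logits",
           "target_index" (hypothetical layout).
    teachers: dict mapping a domain name to a callable source -> logits.
    """
    total = 0.0
    for example in batch:
        teacher = teachers.get(example["domain"])
        if teacher is None:
            # Assumption: examples with no specialized teacher use plain NLL.
            probs = softmax(example["student_logits"])
            total += -math.log(probs[example["target_index"]])
        else:
            teacher_logits = teacher(example["source"])
            total += distillation_loss(
                example["student_logits"],
                teacher_logits,
                example["target_index"],
                alpha,
            )
    return total / len(batch)
```

Because the teachers are only queried while computing the training loss, the finetuned student is the only model needed at inference time, which is consistent with the claim of no added inference cost.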
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
Methods: Knowledge Distillation
