Extending Multilingual Machine Translation through Imitation Learning
Wen Lai, Viktor Hangya, Yingli Shen, Alexander Fraser

TL;DR
This paper introduces Imit-MNMT, an imitation learning approach that extends multilingual neural machine translation models to new languages using expert-generated pseudo-parallel data, improving translation quality on the new language while reducing catastrophic forgetting on existing ones.
Contribution
The paper presents a new imitation learning framework for multilingual translation that leverages expert-generated pseudo-data to incorporate new languages without degrading existing performance.
Findings
Significant improvement in translation quality for new languages.
Reduced catastrophic forgetting on previously supported languages.
Imitation learning, more established in computer vision, applied effectively to multilingual translation.
Abstract
Despite the growing variety of languages supported by existing multilingual neural machine translation (MNMT) models, most of the world's languages are still being left behind. We aim to extend large-scale MNMT models to incorporate a new language, enabling translations between this new language and all previously supported languages, even in the challenging scenario where only a parallel corpus between the new language and English is available. Previous methods, such as continued training on parallel data including the new language, often suffer from catastrophic forgetting, which degrades performance on other languages. We propose a novel approach, Imit-MNMT, which treats this task as an imitation learning problem, a technique widely used in computer vision but less explored in natural language processing. Specifically, we leverage an expert model to generate pseudo-parallel corpora…
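The pseudo-parallel step mentioned in the abstract can be pictured with a minimal sketch. It assumes, as one plausible reading, that the expert translates the English side of each (new language, English) pair into an already supported language; the abstract is truncated before the details, so this is not the authors' implementation. The names `build_pseudo_parallel` and `toy_expert` are hypothetical, and the stub expert only tags its input so the example runs without a real model.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical interface: an expert maps (English sentence, target language code)
# to a translation. In practice this would wrap a pretrained MNMT model.
Expert = Callable[[str, str], str]


def build_pseudo_parallel(
    pairs: List[Tuple[str, str]],
    expert: Expert,
    target_langs: List[str],
) -> Dict[str, List[Tuple[str, str]]]:
    """Pair each new-language sentence with the expert's translation of its
    English counterpart, once per target language (assumed procedure)."""
    corpora: Dict[str, List[Tuple[str, str]]] = {lang: [] for lang in target_langs}
    for new_sentence, en_sentence in pairs:
        for lang in target_langs:
            corpora[lang].append((new_sentence, expert(en_sentence, lang)))
    return corpora


def toy_expert(en_sentence: str, lang: str) -> str:
    # Stand-in for a real model: tags the input instead of translating it.
    return f"[{lang}] {en_sentence}"


if __name__ == "__main__":
    parallel = [("src-1", "hello"), ("src-2", "thank you")]
    pseudo = build_pseudo_parallel(parallel, toy_expert, ["de", "fr"])
    for lang, corpus in pseudo.items():
        print(lang, corpus)
```

Keeping the expert behind a plain callable means the same loop works whether the expert is a frozen copy of the original model or any other translation system.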
Taxonomy
Topics: Natural Language Processing Techniques · Multimodal Machine Learning Applications · Topic Modeling
