Breaking Down Multilingual Machine Translation
Ting-Rui Chiang, Yi-Pei Chen, Yi-Ting Yeh, Graham Neubig

TL;DR
This paper investigates how different multilingual training strategies affect the encoder and decoder of machine translation models. It finds that multilingual training benefits encoders in general but benefits decoders only for low-resource languages, and it proposes improvements based on training with related languages.
Contribution
It provides a detailed analysis of how multilingual training impacts encoder and decoder components, and introduces methods to enhance translation performance with related languages.
Findings
Multilingual training benefits encoders across settings.
Decoders benefit from multilingual training mainly for low-resource languages.
Many-to-one and one-to-many models outperform previous results.
Abstract
While multilingual training is now an essential ingredient in machine translation (MT) systems, recent work has demonstrated that it has different effects in different multilingual settings, such as many-to-one, one-to-many, and many-to-many learning. These training settings expose the encoder and the decoder of a machine translation model to different data distributions. In this paper, we examine how different varieties of multilingual training contribute to learning these two components of the MT model. Specifically, we compare bilingual models with encoders and/or decoders initialized by multilingual training. We show that multilingual training is beneficial to encoders in general, while it only benefits decoders for low-resource languages (LRLs). We further find the important attention heads for each language pair and compare their correlations during inference. Our analysis sheds…
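The comparison described in the abstract, bilingual models whose encoder and/or decoder start from multilingually trained weights, can be illustrated with a small sketch. This is not the authors' code: the function name init_from_multilingual, the "encoder."/"decoder." parameter prefixes, and the toy checkpoints are all assumptions made for illustration, and plain Python lists stand in for real tensors.

```python
from typing import Dict, Iterable, List

# Parameter name -> weights. Lists are a stand-in for real tensors (assumption).
State = Dict[str, List[float]]


def init_from_multilingual(
    bilingual: State,
    multilingual: State,
    components: Iterable[str] = ("encoder",),
) -> State:
    """Return a copy of `bilingual` with the chosen components overwritten
    by the matching parameters from `multilingual`."""
    # Hypothetical naming scheme: parameters are prefixed "encoder." or "decoder.".
    prefixes = tuple(f"{name}." for name in components)
    merged = {key: list(value) for key, value in bilingual.items()}
    for key, value in multilingual.items():
        # Copy only parameters that belong to a requested component
        # and that also exist in the bilingual model.
        if key.startswith(prefixes) and key in merged:
            merged[key] = list(value)
    return merged


# Toy "checkpoints" with made-up parameter names and values.
bilingual_state = {
    "encoder.layer0.weight": [0.0, 0.0],
    "decoder.layer0.weight": [0.0, 0.0],
}
multilingual_state = {
    "encoder.layer0.weight": [1.0, 2.0],
    "decoder.layer0.weight": [3.0, 4.0],
}

# Two of the possible configurations: multilingual encoder only, or both components.
encoder_only = init_from_multilingual(bilingual_state, multilingual_state, ["encoder"])
both = init_from_multilingual(bilingual_state, multilingual_state, ["encoder", "decoder"])
```

In this sketch the bilingual model would then be trained as usual from the merged state; comparing the encoder-only, decoder-only, and both-components variants against a randomly initialized baseline is one way to separate the contribution of each component.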
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
