Revisiting Modularized Multilingual NMT to Meet Industrial Demands
Sungwon Lyu, Bokyung Son, Kichang Yang, and Jaekyoung Bae

TL;DR
This paper explores a modularized multilingual neural machine translation model that shares modules among the same languages, offering a practical industrial solution with improved flexibility, maintainability, and competitive zero-shot performance.
Contribution
It introduces a modular sharing approach (M2) for multilingual NMT that alleviates capacity issues and enhances model modification and incremental training capabilities.
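The modular layout can be pictured with a small sketch. It assumes, as an illustration only, that M2 keeps one encoder and one decoder per language and pairs them per translation direction. The names `M2Registry`, `route`, and `module_counts` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class M2Registry:
    """Toy registry (hypothetical): one encoder and one decoder name per language."""
    encoders: dict = field(default_factory=dict)
    decoders: dict = field(default_factory=dict)

    def add_language(self, lang):
        # Each language owns its modules; nothing is shared across languages.
        self.encoders[lang] = f"enc_{lang}"
        self.decoders[lang] = f"dec_{lang}"

    def route(self, src, tgt):
        # A direction is served by the source encoder and the target decoder.
        return self.encoders[src], self.decoders[tgt]


def module_counts(n_langs):
    """(encoders, decoders) needed to cover every direction among n_langs languages."""
    directions = n_langs * (n_langs - 1)
    return {
        "single": (directions, directions),  # one separate model per direction
        "one_to_one": (1, 1),                # fully shared 1-1 model
        "m2": (n_langs, n_langs),            # one encoder and decoder per language
    }
```

Under this assumption the module count grows linearly with the number of languages, which sits between the fully shared 1-1 model and one model per direction.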
Findings
Multi-way training benefits are retained in M2 without capacity bottlenecks.
Incrementally added modules outperform singly trained modules.
Zero-shot performance of added modules is comparable to supervised models.
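The incremental-training and zero-shot findings can be illustrated with a short sketch. It assumes that existing modules are frozen while the new language's modules are trained; this summary does not state the training procedure, so treat the freezing step as an assumption. The function names and language codes are hypothetical.

```python
def add_language_incrementally(trainable, new_lang):
    """Return trainable flags after adding new_lang.

    Assumption: existing modules are frozen, only the new modules train.
    """
    updated = {name: False for name in trainable}
    updated[f"enc_{new_lang}"] = True
    updated[f"dec_{new_lang}"] = True
    return updated


def zero_shot_directions(existing_langs, new_lang, paired_with):
    """Directions involving new_lang that had no parallel data during incremental training."""
    unseen = [lang for lang in existing_langs if lang not in paired_with]
    return [(new_lang, lang) for lang in unseen] + [(lang, new_lang) for lang in unseen]
```

For example, if a new language is added using parallel data with only one existing language, every direction between it and the remaining languages is zero-shot in this sketch.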
Abstract
The complete sharing of parameters for multilingual translation (1-1) has been the mainstream approach in current research. However, degraded performance due to the capacity bottleneck and low maintainability hinders its extensive adoption in industry. In this study, we revisit the multilingual neural machine translation model that only shares modules among the same languages (M2) as a practical alternative to 1-1 to satisfy industrial requirements. Through comprehensive experiments, we identify the benefits of multi-way training and demonstrate that M2 can enjoy these benefits without suffering from the capacity bottleneck. Furthermore, the interlingual space of M2 allows convenient modification of the model. By leveraging trained modules, we find that incrementally added modules exhibit better performance than singly trained models. The zero-shot performance of the added…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
