Google's Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation
Melvin Johnson, Mike Schuster, Quoc V. Le, Maxim Krikun, Yonghui Wu, Zhifeng Chen, Nikhil Thorat, Fernanda Viégas, Martin Wattenberg, Greg Corrado, Macduff Hughes, Jeffrey Dean

TL;DR
This paper introduces a simple, single-model multilingual neural machine translation system that uses an artificial token to specify target languages, enabling zero-shot translation and achieving state-of-the-art results without increasing model complexity.
Contribution
The authors propose multilingual NMT with a minimal change to a standard system: a single shared model and a shared wordpiece vocabulary, with no added parameters. This setup enables zero-shot translation and often improves quality across the language pairs involved.
Findings
Achieves results comparable or superior to state-of-the-art models on WMT benchmarks.
Enables zero-shot translation between language pairs not seen during training.
Supports multilingual models of up to twelve language pairs, with better translation quality for many individual pairs.
Abstract
We propose a simple solution to use a single Neural Machine Translation (NMT) model to translate between multiple languages. Our solution requires no change in the model architecture from our base system but instead introduces an artificial token at the beginning of the input sentence to specify the required target language. The rest of the model, which includes encoder, decoder and attention, remains unchanged and is shared across all languages. Using a shared wordpiece vocabulary, our approach enables Multilingual NMT using a single model without any increase in parameters, which is significantly simpler than previous proposals for Multilingual NMT. Our method often improves the translation quality of all involved language pairs, even while keeping the total number of model parameters constant. On the WMT'14 benchmarks, a single multilingual model achieves comparable performance for…
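The input-side change described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `add_target_token` and `build_training_pairs` are hypothetical, and the `<2xx>` token spelling is assumed here for illustration. The point it shows is that only the data is modified; the model itself is untouched.

```python
# Minimal sketch of target-language tagging for multilingual NMT.
# Assumption: the artificial token is spelled "<2xx>", where xx is a
# language code. All names below are hypothetical, chosen for illustration.

def add_target_token(source_sentence: str, target_lang: str) -> str:
    """Prepend an artificial token that names the desired target language."""
    return f"<2{target_lang}> {source_sentence}"


def build_training_pairs(corpus):
    """Turn (source, target, target_lang) triples into (tagged source, target) pairs.

    Pairs for every language direction go into one pool, so a single
    shared model is trained on all of them.
    """
    return [(add_target_token(src, lang), tgt) for src, tgt, lang in corpus]


examples = build_training_pairs([
    ("Hello, how are you?", "Hola, ¿cómo estás?", "es"),
    ("Hello, how are you?", "Hallo, wie geht es dir?", "de"),
])

for tagged_source, target in examples:
    print(tagged_source, "->", target)
```

Under this scheme, requesting a direction the model never saw during training only means pairing a source sentence with a target token it has not been combined with before, which is what makes zero-shot translation possible to attempt without any architectural change.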
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
