Self-Learning for Zero Shot Neural Machine Translation
Surafel M. Lakew, Matteo Negri, Marco Turchi

TL;DR
This paper introduces a zero-shot neural machine translation method that learns without a pivot language sharing parallel data with the zero-shot source and target languages, improving translation quality for low-resource language pairs through a self-learning cycle.
Contribution
The work presents a new zero-shot NMT approach that does not require shared parallel data with a pivot language, using a self-learning cycle to enhance translation performance.
Findings
Up to +5.93 BLEU improvement over a supervised baseline
Effective across diverse language pairs with different scripts and relatedness
Outperforms unsupervised NMT, especially in domain-mismatch scenarios
Abstract
Neural Machine Translation (NMT) approaches employing monolingual data are showing steady improvements in resource-rich conditions. However, evaluations using real-world low-resource languages still result in unsatisfactory performance. This work proposes a novel zero-shot NMT modeling approach that learns without the now-standard assumption of a pivot language sharing parallel data with the zero-shot source and target languages. Our approach is based on three stages: initialization from any pre-trained NMT model observing at least the target language, augmentation of source sides leveraging target monolingual data, and learning to optimize the initial model to the zero-shot pair, where the latter two constitute a self-learning cycle. Empirical findings involving four diverse (in terms of language family, script and relatedness) zero-shot pairs show the effectiveness of our approach…
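The three stages in the abstract can be pictured as a simple loop. The sketch below is only an illustration under stated assumptions, not the authors' implementation: it assumes that "augmentation of source sides" means the current model generates synthetic source sentences from target monolingual text, and that "learning" means further training on the resulting synthetic pairs. The names `self_learning_cycle`, `back_translate`, `fine_tune`, and the fixed `rounds` count are hypothetical, and the toy functions merely stand in for real NMT inference and training.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (synthetic source, genuine target)


def self_learning_cycle(
    model,
    target_monolingual: List[str],
    back_translate: Callable[[object, List[str]], List[str]],
    fine_tune: Callable[[object, List[Pair]], object],
    rounds: int = 3,
):
    """Alternate augmentation and learning for a fixed number of rounds.

    `model` plays the role of the initialization stage: any pre-trained
    model that has observed at least the target language (assumption:
    it is passed in already trained).
    """
    for _ in range(rounds):
        # Augmentation: produce synthetic source sides from target monolingual text.
        synthetic_sources = back_translate(model, target_monolingual)
        pairs = list(zip(synthetic_sources, target_monolingual))
        # Learning: adapt the current model to the zero-shot pair on the synthetic pairs.
        model = fine_tune(model, pairs)
    return model


# Toy stand-ins (hypothetical) so the loop can be run end to end.
def toy_back_translate(model, sentences):
    # Tags each output with the model version that produced it.
    return [f"<src v{model['version']}> {s}" for s in sentences]


def toy_fine_tune(model, pairs):
    # Returns a "new" model that records the pairs it was trained on.
    return {"version": model["version"] + 1, "last_pairs": pairs}


initial = {"version": 0, "last_pairs": []}
final = self_learning_cycle(
    initial, ["hello world", "good morning"],
    toy_back_translate, toy_fine_tune, rounds=2,
)
```

In this sketch each round regenerates the synthetic sources with the latest model, so later rounds train on output from an already adapted model; whether the original work stops after a fixed number of rounds or uses another criterion is not stated in the abstract.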
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
Methods: Self-Learning
