TL;DR
This paper presents LMU Munich's unsupervised machine translation system for German<->Upper Sorbian, which combines a monolingually pretrained model, online backtranslation, pseudo-parallel data from unsupervised SMT, BPE-Dropout, and ensembling.
Contribution
The paper introduces a novel combination of monolingual pretraining, pseudo-parallel data, BPE-Dropout, residual adapters, and curriculum learning for unsupervised translation.
Findings
Achieved BLEU scores of 32.4 and 35.2 on the two translation directions.
Found residual adapters useful in the Upper Sorbian->German direction, and applied BPE-Dropout to the low-resource Upper Sorbian data to make the system more robust.
Ensembling improved overall translation quality.
Abstract
This paper describes the submission of LMU Munich to the WMT 2020 unsupervised shared task, in two language directions, German<->Upper Sorbian. Our core unsupervised neural machine translation (UNMT) system follows the strategy of Chronopoulou et al. (2020), using a monolingual pretrained language generation model (on German) and fine-tuning it on both German and Upper Sorbian, before initializing a UNMT model, which is trained with online backtranslation. Pseudo-parallel data obtained from an unsupervised statistical machine translation (USMT) system is used to fine-tune the UNMT model. We also apply BPE-Dropout to the low-resource (Upper Sorbian) data to obtain a more robust system. We additionally experiment with residual adapters and find them useful in the Upper Sorbian->German direction. We explore sampling during backtranslation and curriculum learning to use SMT translations in…
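As an illustration of the BPE-Dropout step mentioned in the abstract, here is a minimal sketch of the general technique: during segmentation, each applicable merge is skipped with probability p, so the same word can be split differently on different passes. This is not the authors' implementation. The function name `bpe_dropout_segment`, the toy merge table, and the default dropout rate are assumptions chosen for illustration only.

```python
import random


def bpe_dropout_segment(word, merges, p=0.1, rng=None):
    """Segment `word` with BPE, dropping each candidate merge with probability p.

    `merges` is an ordered list of symbol pairs (earlier = higher priority).
    Hypothetical helper for illustration; p=0 gives ordinary BPE,
    p=1 leaves the word split into characters.
    """
    rng = rng or random.Random()
    ranks = {pair: i for i, pair in enumerate(merges)}
    symbols = list(word)
    while len(symbols) > 1:
        # Collect adjacent pairs that have a merge rule and survive dropout.
        candidates = [
            (ranks[(a, b)], i)
            for i, (a, b) in enumerate(zip(symbols, symbols[1:]))
            if (a, b) in ranks and rng.random() >= p
        ]
        if not candidates:
            break
        # Apply the surviving merge with the highest priority (lowest rank).
        _, i = min(candidates)
        symbols[i:i + 2] = [symbols[i] + symbols[i + 1]]
    return symbols


# Toy merge table (an assumption, not learned from real data).
toy_merges = [("l", "o"), ("lo", "w"), ("e", "r")]
print(bpe_dropout_segment("lower", toy_merges, p=0.0))
print(bpe_dropout_segment("lower", toy_merges, p=0.5, rng=random.Random(0)))
```

In a setup like the one described, such stochastic segmentation would be applied only to the low-resource side at training time, while inference would use the deterministic segmentation (p = 0).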
