Mitigating Data Imbalance and Representation Degeneration in Multilingual Machine Translation
Wen Lai, Alexandra Chronopoulou, Alexander Fraser

TL;DR
This paper introduces Bi-ACL, a framework that uses only target-side monolingual data and a bilingual dictionary to address data imbalance and representation degeneration in multilingual neural machine translation (MNMT), with the goal of improving MNMT model performance.
Contribution
The paper proposes Bi-ACL, which combines two modules, a bidirectional autoencoder and bidirectional contrastive learning, with online constrained beam search and curriculum-learning sampling.
Findings
Improves translation quality for low-resource languages.
Enhances model robustness in zero-shot translation scenarios.
Effective in both high-resource and low-resource language pairs.
Abstract
Despite advances in multilingual neural machine translation (MNMT), we argue that there are still two major challenges in this area: data imbalance and representation degeneration. The data imbalance problem refers to the imbalance in the amount of parallel corpora for all language pairs, especially for long-tail languages (i.e., very low-resource languages). The representation degeneration problem refers to the problem of encoded tokens tending to appear only in a small subspace of the full space available to the MNMT model. To solve these two issues, we propose Bi-ACL, a framework that uses only target-side monolingual data and a bilingual dictionary to improve the performance of the MNMT model. We define two modules, named bidirectional autoencoder and bidirectional contrastive learning, which we combine with an online constrained beam search and a curriculum learning sampling…
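The representation degeneration described in the abstract (encoded tokens crowding into a small subspace) can be illustrated with a simple, generic diagnostic: the mean pairwise cosine similarity of a set of vectors. The sketch below is an illustration only, not the paper's method or metric; the function names and the toy 2-D vectors are hypothetical, and real encoder outputs would have far more dimensions.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_pairwise_cosine(vectors):
    """Average cosine similarity over all unordered pairs of vectors.

    Values near 1 mean the vectors point in almost the same direction,
    i.e. they occupy a narrow region of the space.
    """
    pairs = list(combinations(vectors, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


# Hypothetical toy "token representations" (assumed data, not from the paper).
spread = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]  # uses the whole plane
narrow = [[1.0, 0.1], [1.0, 0.2], [1.0, -0.1], [1.0, 0.0]]   # crowded into one direction

print("spread:", mean_pairwise_cosine(spread))
print("narrow:", mean_pairwise_cosine(narrow))
```

Under these assumptions, the crowded set scores much closer to 1 than the spread-out set, which is the kind of signal one would look for when checking whether representations have collapsed into a narrow region.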
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
