Mitigating Data Imbalance and Representation Degeneration in Multilingual Machine Translation

Wen Lai; Alexandra Chronopoulou; Alexander Fraser

arXiv:2305.12786 · cs.CL · October 26, 2023 · 1 citation

TL;DR

This paper introduces Bi-ACL, a framework that uses only target-side monolingual data and a bilingual dictionary to address data imbalance and representation degeneration in multilingual neural machine translation (MNMT) and to improve the performance of the MNMT model.

Contribution

The paper proposes Bi-ACL, which defines two modules, a bidirectional autoencoder and bidirectional contrastive learning, and combines them with an online constrained beam search and a curriculum learning sampling strategy.

Findings

01

Improves translation quality for low-resource languages.

02

Enhances model robustness in zero-shot translation scenarios.

03

Effective in both high-resource and low-resource language pairs.

Abstract

Despite advances in multilingual neural machine translation (MNMT), we argue that there are still two major challenges in this area: data imbalance and representation degeneration. The data imbalance problem refers to the imbalance in the amount of parallel corpora for all language pairs, especially for long-tail languages (i.e., very low-resource languages). The representation degeneration problem refers to the problem of encoded tokens tending to appear only in a small subspace of the full space available to the MNMT model. To solve these two issues, we propose Bi-ACL, a framework that uses only target-side monolingual data and a bilingual dictionary to improve the performance of the MNMT model. We define two modules, named bidirectional autoencoder and bidirectional contrastive learning, which we combine with an online constrained beam search and a curriculum learning sampling…
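To make the representation degeneration problem concrete, the sketch below computes the mean pairwise cosine similarity of a set of vectors, a common diagnostic for how narrowly representations cluster. This is an illustration only, not the measure or method used in the paper; the function names (`cosine`, `mean_pairwise_cosine`) and the toy 2-D vectors are assumptions made for the example.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_pairwise_cosine(vectors):
    """Average cosine similarity over all unordered pairs of vectors.

    Values near 1 suggest the vectors occupy a narrow cone (a small
    subspace); values near or below 0 suggest they are spread out.
    """
    pairs = list(combinations(vectors, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


# Hypothetical toy "token representations" (not data from the paper).
spread = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
narrow = [[1.0, 0.1], [1.0, 0.2], [1.0, -0.1], [1.0, 0.0]]

print("spread:", mean_pairwise_cosine(spread))
print("narrow:", mean_pairwise_cosine(narrow))
```

In this toy setup the `narrow` set scores much higher than the `spread` set, which is the pattern one would expect from representations that use only a small part of the available space.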

Code & Models

Repositories

lavine-lmu/bi-acl
PyTorch · Official

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification