Rethinking Data Augmentation for Low-Resource Neural Machine Translation: A Multi-Task Learning Approach
Víctor M. Sánchez-Cartagena, Miquel Esplà-Gomis, Juan Antonio Pérez-Ortiz, Felipe Sánchez-Martínez

TL;DR
This paper introduces a multi-task data augmentation method for low-resource neural machine translation, generating unfluent sentence pairs to improve encoder robustness and translation quality.
Contribution
It proposes a novel multi-task data augmentation approach using sentence transformations that enhances model robustness and reduces hallucinations in low-resource NMT.
Findings
Consistent improvements on six low-resource translation tasks.
Enhanced reliance on source tokens and robustness against domain shifts.
Reduced hallucinations in translated outputs.
Abstract
In the context of neural machine translation, data augmentation (DA) techniques may be used for generating additional training samples when the available parallel data are scarce. Many DA approaches aim at expanding the support of the empirical data distribution by generating new sentence pairs that contain infrequent words, thus making it closer to the true data distribution of parallel sentences. In this paper, we propose to follow a completely different approach and present a multi-task DA approach in which we generate new sentence pairs with transformations, such as reversing the order of the target sentence, which produce unfluent target sentences. During training, these augmented sentences are used as auxiliary tasks in a multi-task framework with the aim of providing new contexts where the target prefix is not informative enough to predict the next word. This strengthens the…
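The reversal transformation mentioned in the abstract can be illustrated with a minimal sketch. It assumes whitespace tokenization and a task marker prepended to the source sentence so the model can tell the auxiliary task from regular translation. The marker `<rev>` and the helper names `reverse_target` and `augment` are hypothetical, chosen for illustration, and are not taken from the paper.

```python
from typing import List, Tuple

# Hypothetical task marker (an assumption, not from the paper): it tells the
# model that the expected output is the reversed target sentence.
REVERSE_TAG = "<rev>"


def reverse_target(src: str, tgt: str) -> Tuple[str, str]:
    """Build an auxiliary-task pair whose target tokens are in reverse order."""
    reversed_tgt = " ".join(reversed(tgt.split()))
    return f"{REVERSE_TAG} {src}", reversed_tgt


def augment(corpus: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return the original pairs interleaved with their reversed-target variants."""
    augmented: List[Tuple[str, str]] = []
    for src, tgt in corpus:
        augmented.append((src, tgt))                  # main translation task
        augmented.append(reverse_target(src, tgt))    # auxiliary task
    return augmented


if __name__ == "__main__":
    for pair in augment([("ein kleines Haus", "a small house")]):
        print(pair)
```

In the reversed pair the target prefix carries little fluent context, so predicting the next word has to lean on the source sentence, which is the effect the abstract describes.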
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
