Improving Neural Machine Translation of Indigenous Languages with Multilingual Transfer Learning
Wei-Rui Chen, Muhammad Abdul-Mageed

TL;DR
This paper presents a transfer learning approach that uses bilingual and multilingual pretrained MT models to improve machine translation from Spanish into ten South American Indigenous languages, setting new state-of-the-art results on five of the ten language pairs while staying in the low-resource setting.
Contribution
It introduces a transfer learning method that leverages bilingual and multilingual pretrained MT models for low-resource Indigenous language translation and, on five of the ten language pairs, surpasses previous state-of-the-art systems that relied on data augmentation.
Findings
Set new SOTA on five of the ten language pairs
Doubled performance on one of those five pairs
Effective in the low-resource setting, without data augmentation
Abstract
Machine translation (MT) involving Indigenous languages, including those possibly endangered, is challenging due to lack of sufficient parallel data. We describe an approach exploiting bilingual and multilingual pretrained MT models in a transfer learning setting to translate from Spanish to ten South American Indigenous languages. Our models set new SOTA on five out of the ten language pairs we consider, even doubling performance on one of these five pairs. Unlike previous SOTA that perform data augmentation to enlarge the train sets, we retain the low-resource setting to test the effectiveness of our models under such a constraint. In spite of the rarity of linguistic information available about the Indigenous languages, we offer a number of quantitative and qualitative analyses (e.g., as to morphology, tokenization, and orthography) to contextualize our results.
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
