A Few Thousand Translations Go a Long Way! Leveraging Pre-trained Models for African News Translation
David Ifeoluwa Adelani, Jesujoba Oluwadara Alabi, Angela Fan, Julia, Kreutzer, Xiaoyu Shen, Machel Reid, Dana Ruiter, Dietrich Klakow, Peter, Nabende, Ernie Chang, Tajuddeen Gwadabe, Freshia Sackey, Bonaventure F. P., Dossou, Chris Chinenye Emezue, Colin Leong, Michael Beukman

TL;DR
This paper explores how to adapt large pre-trained multilingual models for low-resource African languages by fine-tuning them on small, high-quality translation datasets, improving translation across languages and domains.
Contribution
It introduces a method for leveraging existing pre-trained models to develop translation systems for African languages not included in initial pre-training.
Findings
Fine-tuning large models on small datasets improves translation quality.
A new African news corpus covering 16 languages was created.
Effective transfer to new domains was demonstrated.
Abstract
Recent advances in the pre-training of language models leverage large-scale datasets to create multilingual models. However, low-resource languages are mostly left out in these datasets. This is primarily because many widely spoken languages are not well represented on the web and therefore excluded from the large-scale crawls used to create datasets. Furthermore, downstream users of these models are restricted to the selection of languages originally chosen for pre-training. This work investigates how to optimally leverage existing pre-trained models to create low-resource translation systems for 16 African languages. We focus on two questions: 1) How can pre-trained models be used for languages not included in the initial pre-training? and 2) How can the resulting translation models effectively transfer to new domains? To answer these questions, we create a new African news corpus…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsTopic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
