Participatory Research for Low-resourced Machine Translation: A Case Study in African Languages
Wilhelmina Nekoto, Vukosi Marivate, Tshinondiwa Matsila, Timi Fasubaa, Tajudeen Kolawole, Taiwo Fagbohungbe, Solomon Oluwole Akinola, Shamsuddeen Hassan Muhammad, Salomon Kabongo, Salomey Osei, Sackey Freshia, Rubungo Andre Niyongabo, Ricky Macharm, Perez Ogayo

TL;DR
This paper advocates participatory research for low-resourced machine translation and demonstrates its feasibility and scalability through a case study on African languages, which produced new datasets and benchmarks and brought community members into the research process.
Contribution
The paper proposes participatory research as a scalable approach to developing MT resources for African languages, one that involves non-experts and produces open datasets and benchmarks.
Findings
Created translation datasets for over 30 African languages
Developed MT benchmarks with human evaluations for a third of these languages
Enabled community members without formal training to contribute scientifically
Abstract
Research in NLP lacks geographic diversity, and the question of how NLP can be scaled to low-resourced languages has not yet been adequately solved. "Low-resourced"-ness is a complex problem going beyond data availability and reflects systemic problems in society. In this paper, we focus on the task of Machine Translation (MT), which plays a crucial role in information accessibility and communication worldwide. Despite immense improvements in MT over the past decade, MT is centered around a few high-resourced languages. As MT researchers cannot solve the problem of low-resourcedness alone, we propose participatory research as a means to involve all necessary agents required in the MT development process. We demonstrate the feasibility and scalability of participatory research with a case study on MT for African languages. Its implementation leads to a collection of novel translation…
