Masakhane -- Machine Translation For Africa
Iroro Orife, Julia Kreutzer, Blessing Sibanda, Daniel Whitenack, Kathleen Siminyu, Laura Martinus, Jamiil Toure Ali, Jade Abbott, Vukosi Marivate, Salomon Kabongo, Musie Meressa, Espoir Murhabazi, Orevaoghene Ahia, Elan van Biljon, Arshath Ramkilowan, Adewale Akinfaderin

TL;DR
Masakhane is an open-source initiative that aims to improve machine translation for African languages by building a community, fostering research, and addressing resource and benchmark scarcity in African NLP.
Contribution
This paper introduces the Masakhane project, a collaborative effort to develop NLP resources and benchmarks for African languages, filling a critical gap in the field.
Findings
Successful community building across African countries
Increased research output on African language translation
Development of initial translation benchmarks
Abstract
Africa has over 2000 languages. Despite this, African languages account for a small portion of available resources and publications in Natural Language Processing (NLP). This is due to multiple factors, including: a lack of focus from government and funding, discoverability, a lack of community, sheer language complexity, difficulty in reproducing papers and no benchmarks to compare techniques. To begin to address the identified problems, MASAKHANE, an open-source, continent-wide, distributed, online research effort for machine translation for African languages, was founded. In this paper, we discuss our methodology for building the community and spurring research from the African continent, as well as outline the success of the community in terms of addressing the identified problems affecting African NLP.
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Wikis in Education and Collaboration
