AI4D -- African Language Dataset Challenge
Kathleen Siminyu, Sackey Freshia, Jade Abbott, Vukosi Marivate

TL;DR
The paper describes the organization of the AI4D African Language Dataset Challenge, aimed at encouraging the creation and sharing of annotated datasets for African languages to advance language technology development.
Contribution
It introduces a competitive framework to incentivize the development and dissemination of African language datasets for machine learning applications.
Findings
Increased submission of African language datasets.
Enhanced availability of annotated datasets for African languages.
Fostered community engagement in African language NLP resources.
Abstract
As language and speech technologies become more advanced, the lack of fundamental digital resources for African languages, such as data, spell checkers and Part of Speech taggers, means that the digital divide between these languages and others keeps growing. This work details the organisation of the AI4D - African Language Dataset Challenge, an effort to incentivize the creation, organization and discovery of African language datasets through a competitive challenge. We particularly encouraged the submission of annotated datasets which can be used for training task-specific supervised machine learning models.
