Cheetah: Natural Language Generation for 517 African Languages
Ife Adebara, AbdelRahim Elmadany, Muhammad Abdul-Mageed

TL;DR
Cheetah is a large multilingual NLG model supporting 517 African languages, significantly improving text generation quality across diverse low-resource languages and fostering linguistic diversity in NLP.
Contribution
We developed Cheetah, the first massively multilingual NLG model for 517 African languages, demonstrating its effectiveness through extensive evaluations and human assessments.
Findings
Cheetah outperforms existing models on five of six generation tasks.
The model generates coherent and contextually appropriate text in many African languages.
Publicly released models promote research and application development for African linguistic diversity.
Abstract
Low-resource African languages pose unique challenges for natural language processing (NLP) tasks, including natural language generation (NLG). In this paper, we develop Cheetah, a massively multilingual NLG language model for African languages. Cheetah supports 517 African languages and language varieties, allowing us to address the scarcity of NLG resources and provide a solution to foster linguistic diversity. We demonstrate the effectiveness of Cheetah through comprehensive evaluations across six generation downstream tasks. In five of the six tasks, Cheetah significantly outperforms other models, showcasing its remarkable performance for generating coherent and contextually appropriate text in a wide range of African languages. We additionally conduct a detailed human evaluation to delve deeper into the linguistic capabilities of Cheetah. The introduction of Cheetah has…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
