IrokoBench: A New Benchmark for African Languages in the Age of Large Language Models
David Ifeoluwa Adelani, Jessica Ojo, Israel Abebe Azime, Jian Yun Zhuang, Jesujoba O. Alabi, Xuanli He, Millicent Ochieng, Sara Hooker, Andiswa Bukula, En-Shiun Annie Lee, Chiamaka Chukwuneke, Happy Buzaaba, Blessing Sibanda, Godson Kalipe, Jonathan Mukiibi, Salomon Kabongo

TL;DR
IrokoBench introduces a comprehensive benchmark dataset for 17 low-resource African languages, highlighting significant performance gaps in LLMs and emphasizing the need for more development tailored to these languages.
Contribution
The paper presents IrokoBench, a new multilingual benchmark for African languages, and evaluates LLMs, revealing critical gaps and the impact of translation strategies.
Findings
Large performance gap between high-resource and African languages.
Proprietary models outperform open models significantly.
Translation to English improves performance for some models.
Abstract
Despite the widespread adoption of large language models (LLMs), their remarkable capabilities remain limited to a few high-resource languages. Additionally, many low-resource languages (e.g., African languages) are often evaluated only on basic text classification tasks due to the lack of appropriate or comprehensive benchmarks outside of high-resource languages. In this paper, we introduce IrokoBench, a human-translated benchmark dataset for 17 typologically diverse low-resource African languages covering three tasks: natural language inference (AfriXNLI), mathematical reasoning (AfriMGSM), and multi-choice knowledge-based question answering (AfriMMLU). We use IrokoBench to evaluate zero-shot, few-shot, and translate-test settings (where test sets are translated into English) across 10 open and six proprietary LLMs. Our evaluation reveals a significant performance gap between…
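The translate-test setting mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code: `predict` and `translate_to_english` are hypothetical callables standing in for an LLM and a machine translation system, and exact-match accuracy is an assumed metric chosen for simplicity.

```python
from typing import Callable, Dict, Sequence


def accuracy(predict: Callable[[str], str],
             inputs: Sequence[str],
             gold: Sequence[str]) -> float:
    """Exact-match accuracy of `predict` over paired inputs and gold labels."""
    if len(inputs) != len(gold) or not inputs:
        raise ValueError("inputs and gold must be non-empty and equal in length")
    correct = sum(predict(x) == y for x, y in zip(inputs, gold))
    return correct / len(inputs)


def compare_settings(predict: Callable[[str], str],
                     translate_to_english: Callable[[str], str],
                     inputs: Sequence[str],
                     gold: Sequence[str]) -> Dict[str, float]:
    """Score the same test set in-language and after translation to English.

    Both callables are placeholders (assumptions), not a real model or MT API.
    """
    in_language = accuracy(predict, inputs, gold)
    # Translate-test: translate each test input into English, then evaluate.
    translated = [translate_to_english(x) for x in inputs]
    translate_test = accuracy(predict, translated, gold)
    return {
        "in_language": in_language,
        "translate_test": translate_test,
        "delta": translate_test - in_language,
    }
```

A positive `delta` would mean the hypothetical model does better on the English translations than on the original-language inputs. This is the kind of comparison the translate-test setting is meant to expose.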
Taxonomy
Topics: Natural Language Processing Techniques · Language and cultural evolution · Multilingual Education and Policy
Methods: Sparse Evolutionary Training · LLaMA
