AfriStereo: A Culturally Grounded Dataset for Evaluating Stereotypical Bias in Large Language Models

Yann Le Beux; Oluchi Audu; Oche D. Ankeli; Dhananjay Balakrishnan; Melissah Weya; Marie D. Ralaiarinosy; Ignatius Ezeani

arXiv:2511.22016·cs.CL·December 1, 2025

Open Access

TL;DR

AfriStereo introduces the first open-source African stereotype dataset and evaluation framework grounded in local socio-cultural contexts, revealing stereotypical biases in language models and providing tools for more inclusive AI development.

Contribution

The paper presents a novel African stereotype dataset and an accompanying evaluation methodology, addressing the lack of culturally relevant bias benchmarks in NLP.

Findings

01. Nine of eleven evaluated models show statistically significant bias toward stereotypes.

02. Bias Preference Ratios range from 0.63 to 0.78, indicating a systematic preference for stereotypical content.

03. Domain-specific models exhibit weaker bias, suggesting that task-specific training may reduce stereotypical associations.

Abstract

Existing AI bias evaluation benchmarks largely reflect Western perspectives, leaving African contexts underrepresented and enabling harmful stereotypes in applications across various domains. To address this gap, we introduce AfriStereo, the first open-source African stereotype dataset and evaluation framework grounded in local socio-cultural contexts. Through community-engaged efforts across Senegal, Kenya, and Nigeria, we collected 1,163 stereotypes spanning gender, ethnicity, religion, age, and profession. Using few-shot prompting with human-in-the-loop validation, we augmented the dataset to over 5,000 stereotype-antistereotype pairs. Entries were validated through semantic clustering and manual annotation by culturally informed reviewers. Preliminary evaluation of language models reveals that nine of eleven models exhibit statistically significant bias, with Bias Preference Ratios…
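
The abstract is cut off before it defines the Bias Preference Ratio, so the following is only a sketch under an assumed definition: the fraction of stereotype-antistereotype pairs for which a model scores the stereotype higher, tested against a no-preference rate of 0.5 with an exact two-sided binomial test. The function names, the score format, and the choice of test are illustrative assumptions, not the authors' published method.

```python
from math import exp, lgamma, log

def bias_preference_ratio(pair_scores):
    """Assumed definition: share of pairs where the stereotype scores higher.

    pair_scores: list of (stereotype_score, antistereotype_score) tuples,
    e.g. sentence log-likelihoods from a language model (hypothetical input).
    Tied pairs are skipped because they express no preference.
    Returns (ratio, stereotype_wins, decided_pairs).
    """
    decided = [(s, a) for s, a in pair_scores if s != a]
    if not decided:
        raise ValueError("no pairs with a strict preference")
    wins = sum(1 for s, a in decided if s > a)
    return wins / len(decided), wins, len(decided)

def binomial_two_sided_p(wins, n, p=0.5):
    """Exact two-sided binomial p-value against a null preference rate p.

    Probabilities are computed in log space so that n in the thousands
    does not overflow. Requires 0 < p < 1.
    """
    def pmf(k):
        log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
        return exp(log_coeff + k * log(p) + (n - k) * log(1 - p))

    observed = pmf(wins)
    # Sum every outcome that is no more likely than the observed one;
    # the small relative tolerance absorbs floating-point ties.
    total = sum(q for q in map(pmf, range(n + 1)) if q <= observed * (1 + 1e-7))
    return min(1.0, total)

# Toy data: 7 pairs favour the stereotype, 3 favour the anti-stereotype.
toy_scores = [(-1.0, -2.0)] * 7 + [(-2.0, -1.0)] * 3
ratio, wins, n = bias_preference_ratio(toy_scores)
p_value = binomial_two_sided_p(wins, n)
print(f"BPR={ratio:.2f}, p={p_value:.4f}")
```

Under this assumed definition, a ratio of 0.5 would mean no preference, and values such as the reported 0.63 to 0.78 would mean the stereotype is preferred in roughly two thirds to three quarters of pairs.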

Taxonomy

Topics: Ethics and Social Impacts of AI · Computational and Text Analysis Methods · Hate Speech and Cyberbullying Detection