BharatBBQ: A Multilingual Bias Benchmark for Question Answering in the Indian Context

Aditya Tomar; Nihar Ranjan Sahoo; Pushpak Bhattacharyya

arXiv:2508.07090·cs.CL·August 12, 2025

TL;DR

BharatBBQ is a culturally adapted, multilingual benchmark for evaluating social bias in question answering across eight languages: English plus Hindi, Marathi, Bengali, Tamil, Telugu, Odia, and Assamese. Evaluations show that bias persists across languages and social categories and is often amplified in Indian languages relative to English.

Contribution

The paper introduces BharatBBQ, the first culturally and linguistically tailored bias benchmark for Indian languages, expanding bias evaluation beyond Western-centric datasets.

Findings

01. Bias persists across all evaluated languages and categories.

02. Indian languages often exhibit amplified biases compared to English.

03. Multilingual models show varying bias levels across languages.

Abstract

Evaluating social biases in language models (LMs) is crucial for ensuring fairness and minimizing the reinforcement of harmful stereotypes in AI systems. Existing benchmarks, such as the Bias Benchmark for Question Answering (BBQ), primarily focus on Western contexts, limiting their applicability to the Indian context. To address this gap, we introduce BharatBBQ, a culturally adapted benchmark designed to assess biases in Hindi, English, Marathi, Bengali, Tamil, Telugu, Odia, and Assamese. BharatBBQ covers 13 social categories, including 3 intersectional groups, reflecting prevalent biases in the Indian sociocultural landscape. Our dataset contains 49,108 examples in one language that are expanded using translation and verification to 392,864 examples in eight different languages. We evaluate five multilingual LM families across zero and few-shot settings, analyzing their bias and…
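
The dataset sizes quoted in the abstract can be sanity-checked with a few lines of Python. This is only a minimal sketch: the constant and function names are illustrative, and it assumes that each of the eight languages holds the same number of examples, which is what the abstract's figures imply.

```python
# Figures taken from the abstract; the names below are illustrative only.
LANGUAGES = [
    "Hindi", "English", "Marathi", "Bengali",
    "Tamil", "Telugu", "Odia", "Assamese",
]
EXAMPLES_PER_LANGUAGE = 49_108  # examples in a single language
REPORTED_TOTAL = 392_864        # total reported across all languages


def expected_total(per_language: int, languages: list[str]) -> int:
    """Total size, assuming every language has the same number of examples."""
    return per_language * len(languages)


total = expected_total(EXAMPLES_PER_LANGUAGE, LANGUAGES)
print(f"{len(LANGUAGES)} languages x {EXAMPLES_PER_LANGUAGE} = {total}")
print("matches reported total:", total == REPORTED_TOTAL)
```

The product agrees with the reported total, which is consistent with the single-language set being translated in full into each of the other languages.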

Code & Models

Datasets

aditya20t/BharatBBQ
dataset · 272 downloads

Videos

BharatBBQ: A Multilingual Bias Benchmark for Question Answering in the Indian Context · Underline

Taxonomy

Topics: Topic Modeling · Computational and Text Analysis Methods · Ethics and Social Impacts of AI