PBBQ: A Persian Bias Benchmark Dataset Curated with Human-AI Collaboration for Large Language Models

Farhan Farsi; Shayan Bali; Fatemeh Valeh; Parsa Ghofrani; Alireza Pakniat; Kian Kashfipour; Amir H. Payberah

arXiv:2510.19616·cs.CL·October 23, 2025

Open Access

TL;DR

This paper introduces PBBQ, a Persian bias benchmark of over 37,000 questions built through human-AI collaboration, for evaluating social biases in Persian language models. The benchmarked models show significant biases that often mirror human stereotypes.

Contribution

The paper presents the first comprehensive Persian social bias benchmark dataset developed with social science expertise, enabling evaluation and mitigation of biases in Persian LLMs.

Findings

01 Current Persian LLMs exhibit significant social biases.

02 Models often replicate human bias patterns.

03 PBBQ provides a valuable resource for bias evaluation.

Abstract

With the increasing adoption of large language models (LLMs), ensuring their alignment with social norms has become a critical concern. While prior research has examined bias detection in various languages, there remains a significant gap in resources addressing social biases within Persian cultural contexts. In this work, we introduce PBBQ, a comprehensive benchmark dataset designed to evaluate social biases in Persian LLMs. Our benchmark, which encompasses 16 cultural categories, was developed through questionnaires completed by 250 diverse individuals across multiple demographics, in close collaboration with social science experts to ensure its validity. The resulting PBBQ dataset contains over 37,000 carefully curated questions, providing a foundation for the evaluation and mitigation of bias in Persian language models. We benchmark several open-source LLMs, a closed-source model,…

Taxonomy

Topics: Computational and Text Analysis Methods · Explainable Artificial Intelligence (XAI) · Topic Modeling