A Scalable Entity-Based Framework for Auditing Bias in LLMs

Akram Elbouanani; Aboubacar Tuo; Adrian Popescu

arXiv:2601.12374·cs.CL·May 12, 2026

TL;DR

This paper presents a scalable, entity-based framework for auditing bias in LLMs. It uses controlled synthetic data to detect systematic biases across multiple dimensions, and it finds consistent bias patterns that increasing model size amplifies.

Contribution

The authors introduce a scalable bias-auditing framework built on controlled synthetic data and use it to conduct what they describe as the largest bias audit to date: 1.9 billion data points across multiple entity types, tasks, languages, models, and prompting strategies.

Findings

01. Models favor left-wing politicians over right-wing ones.

02. Models favor Western countries and companies over those in the Global South.

03. Increasing model size amplifies biases despite instruction tuning.

Abstract

Existing approaches to bias evaluation in large language models (LLMs) trade ecological validity for statistical control, relying either on artificial prompts that poorly reflect real-world use or on naturalistic tasks that lack scale and rigor. We introduce a scalable bias-auditing framework that uses named entities as controlled probes to measure systematic disparities in model behavior. Synthetic data enables us to construct diverse, controlled inputs, and we show that it reliably reproduces bias patterns observed in natural text, supporting its use for large-scale analysis. Using this framework, we conduct the largest bias audit to date, comprising 1.9 billion data points across multiple entity types, tasks, languages, models, and prompting strategies. We find consistent patterns: models penalize right-wing politicians and favor left-wing politicians, prefer Western and wealthier…
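The idea of using named entities as controlled probes can be illustrated with a minimal sketch. This is not the authors' implementation: the templates, the placeholder entities and group labels, and the `toy_score` function (a stand-in for a real model call, made deliberately biased so the effect is visible) are all assumptions chosen for illustration. The sketch fills each fixed template with every entity, scores the result, subtracts the per-template mean so that only the entity varies, and compares group averages.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical templates: everything except the {entity} slot is held fixed,
# so any score difference within a template comes from the entity alone.
TEMPLATES = [
    "{entity} announced a new policy today.",
    "Critics and supporters reacted to the statement from {entity}.",
]

# Hypothetical entities mapped to a group label (placeholders, not real names).
ENTITIES = {
    "Entity A": "group_1",
    "Entity B": "group_1",
    "Entity C": "group_2",
    "Entity D": "group_2",
}


def build_probes(templates, entities):
    """Cross every template with every entity."""
    return [
        (t_idx, name, group, template.format(entity=name))
        for t_idx, template in enumerate(templates)
        for name, group in entities.items()
    ]


def group_disparity(probes, score_fn):
    """Mean score per group after subtracting each template's mean score."""
    by_template = defaultdict(list)
    scored = []
    for t_idx, _name, group, text in probes:
        score = score_fn(text)
        scored.append((t_idx, group, score))
        by_template[t_idx].append(score)
    template_mean = {t: mean(v) for t, v in by_template.items()}

    by_group = defaultdict(list)
    for t_idx, group, score in scored:
        by_group[group].append(score - template_mean[t_idx])
    return {g: mean(v) for g, v in by_group.items()}


def toy_score(text):
    """Stand-in for a model output (assumption); biased toward group_1 on purpose."""
    bonus = 0.2 if ("Entity A" in text or "Entity B" in text) else 0.0
    return 0.5 + bonus


probes = build_probes(TEMPLATES, ENTITIES)
disparity = group_disparity(probes, toy_score)
gap = disparity["group_1"] - disparity["group_2"]
print(f"group_1 minus group_2 gap: {gap:.3f}")
```

In an actual audit, `toy_score` would be replaced by a task-specific model output, and the gap would need a significance test over many templates and entities before being read as evidence of bias.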
