Obscured but Not Erased: Evaluating Nationality Bias in LLMs via Name-Based Bias Benchmarks
Giulio Pelosio, Devesh Batra, Noémie Bovey, Robert Hankache, Cristovao Iglesias, Greig Cowan, Raad Khraishi

TL;DR
This paper introduces a name-based benchmark for evaluating nationality bias in large language models. It finds that smaller models exhibit more bias and retain more errors, and that bias persists in LLMs even when nationality is only implied by a name.
Contribution
It presents a novel name-based benchmarking method to assess nationality bias in LLMs, demonstrating bias and accuracy differences across model sizes and providers.
Findings
Small models show higher bias and lower accuracy.
Name substitution impacts bias and error retention.
Bias persists even in larger models like GPT-4o.
Abstract
Large Language Models (LLMs) can exhibit latent biases towards specific nationalities even when explicit demographic markers are not present. In this work, we introduce a novel name-based benchmarking approach derived from the Bias Benchmark for QA (BBQ) dataset to investigate the impact of substituting explicit nationality labels with culturally indicative names, a scenario more reflective of real-world LLM applications. Our novel approach examines how this substitution affects both bias magnitude and accuracy across a spectrum of LLMs from industry leaders such as OpenAI, Google, and Anthropic. Our experiments show that small models are less accurate and exhibit more bias compared to their larger counterparts. For instance, on our name-based dataset and in the ambiguous context (where the correct choice is not revealed), Claude Haiku exhibited the worst stereotypical bias scores of…
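The substitution described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the template text, the question, the example names, and the `build_item` helper are hypothetical and are not taken from the paper or from the BBQ dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QAItem:
    context: str
    question: str
    options: tuple


# Hypothetical ambiguous-context template: nothing in it reveals the answer,
# so the correct choice is always "Unknown".
CONTEXT_TEMPLATE = "{a} and {b} were both waiting for the same train."
QUESTION = "Who was late?"
UNKNOWN = "Unknown"


def build_item(a: str, b: str) -> QAItem:
    """Fill the template with two person mentions and build the answer options."""
    context = CONTEXT_TEMPLATE.format(a=a, b=b)
    return QAItem(context, QUESTION, (a, b, UNKNOWN))


# Explicit-label variant: nationality is stated directly.
explicit_item = build_item("The British man", "The Japanese man")

# Name-based variant: the label is replaced by a culturally indicative name
# (illustrative names only).
name_item = build_item("Oliver", "Haruto")

print(explicit_item.context)
print(name_item.context)
```

Both variants share the same question and the same correct answer, so any difference in a model's choices between them can be attributed to how nationality is signalled rather than to the task itself.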
Taxonomy
Topics: Artificial Intelligence in Law · Library Science and Information Systems · Legal Education and Practice Innovations
