Relative Bias: A Comparative Framework for Quantifying Bias in LLMs

Alireza Arbabi; Florian Kerschbaum

arXiv:2505.17131·cs.CL·May 26, 2025

TL;DR

This paper introduces the Relative Bias framework, a systematic method for comparing bias across large language models. It measures how one model's behavior deviates from that of other models within a target domain, using two complementary techniques: Embedding Transformation analysis and LLM-as-a-Judge evaluation.

Contribution

The paper presents a comparative framework that quantifies an LLM's bias relative to other LLMs rather than against an absolute definition of bias. It combines Embedding Transformation analysis with LLM-as-a-Judge evaluation.

Findings

01 Strong correlation between the two bias scoring methods (Embedding Transformation and LLM-as-a-Judge)

02 Framework is systematic, scalable, and statistically grounded

03 Effective in bias and alignment case studies

Abstract

The growing deployment of large language models (LLMs) has amplified concerns regarding their inherent biases, raising critical questions about their fairness, safety, and societal impact. However, quantifying LLM bias remains a fundamental challenge, complicated by the ambiguity of what "bias" entails. This challenge grows as new models emerge rapidly and gain widespread use, while introducing potential biases that have not been systematically assessed. In this paper, we propose the Relative Bias framework, a method designed to assess how an LLM's behavior deviates from other LLMs within a specified target domain. We introduce two complementary methodologies: (1) Embedding Transformation analysis, which captures relative bias patterns through sentence representations over the embedding space, and (2) LLM-as-a-Judge, which employs a language model to evaluate outputs comparatively.…
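The idea of scoring a model by how far it deviates from other models in embedding space can be illustrated with a small sketch. This is not the paper's procedure: the function names (`cosine_distance`, `centroid`, `relative_deviation`), the use of cosine distance, and the choice to average baseline embeddings into a centroid are all assumptions made for illustration. The sketch assumes that each model's response to each prompt has already been turned into a nonzero sentence-embedding vector by some encoder.

```python
import math
from statistics import mean


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two nonzero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [mean(column) for column in zip(*vectors)]


def relative_deviation(target_embs, baseline_embs_per_model):
    """Hypothetical relative score (an assumption, not the paper's metric).

    target_embs: one embedding per prompt for the model under study.
    baseline_embs_per_model: one list per baseline model, each holding
        one embedding per prompt, in the same prompt order.
    Returns the mean cosine distance between the target's response and
    the centroid of the baseline responses to the same prompt.
    """
    per_prompt = []
    for i, target_vec in enumerate(target_embs):
        reference = centroid([model[i] for model in baseline_embs_per_model])
        per_prompt.append(cosine_distance(target_vec, reference))
    return mean(per_prompt)


if __name__ == "__main__":
    # Toy 2-D "embeddings": two prompts, two baseline models.
    baselines = [
        [[1.0, 0.0], [0.0, 1.0]],
        [[1.0, 0.0], [0.0, 1.0]],
    ]
    aligned_target = [[1.0, 0.0], [0.0, 1.0]]
    divergent_target = [[0.0, 1.0], [1.0, 0.0]]
    print(relative_deviation(aligned_target, baselines))
    print(relative_deviation(divergent_target, baselines))
```

A target model whose responses sit at the baseline centroid receives a score near zero, while one whose responses point in an orthogonal direction receives a score near one. A real analysis would also need a statistical test to decide whether an observed deviation is larger than the variation among the baseline models themselves.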
