Invisible Influences: Investigating Implicit Intersectional Biases through Persona Engineering in Large Language Models

Nandini Arimanda; Achyuth Mukund; Sakthi Balan Muthiah; Rajesh Sharma

arXiv:2604.06213·cs.CL·April 9, 2026

TL;DR

This paper introduces BADx, a scalable metric to measure how persona contexts influence intersectional biases in large language models, revealing dynamic bias shifts that static tests miss.

Contribution

The study develops BADx, which combines differential bias scores, persona sensitivity, and volatility with LIME-based explanations to detect persona-induced bias amplification in LLMs, advancing bias evaluation methods.

Findings

01

Persona context significantly modulates model biases.

02

GPT-4o shows high bias sensitivity and volatility.

03

LLaMA-4 maintains low volatility and a stable bias profile.

Abstract

Large Language Models (LLMs) excel at human-like language generation but often embed and amplify implicit, intersectional biases, especially under persona-driven contexts. Existing bias audits rely on static, embedding-based tests (CEAT, I-WEAT, I-SEAT) that quantify absolute association strengths. We show that these tests are limited in capturing dynamic shifts when models adopt social roles. We address this gap by introducing the Bias Amplification Differential and Explainability Score (BADx): a novel, scalable metric that measures persona-induced bias amplification and integrates local explainability insights. BADx comprises three components: differential bias scores (BAD, based on CEAT, I-WEAT, I-SEAT), a Persona Sensitivity Index (PSI), and Volatility (standard deviation), augmented by LIME-based analysis for explainability. This study is divided and performed as two…
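
The abstract names three BADx components but this summary does not give their formulas, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes a differential is a persona-conditioned bias score minus a no-persona baseline score, takes volatility as the standard deviation of those differentials (the abstract does identify volatility with standard deviation), and uses the mean absolute differential as a hypothetical stand-in for a sensitivity index. The function names, persona labels, and numbers are invented for illustration.

```python
from statistics import mean, pstdev


def bias_differentials(baseline, persona_scores):
    """Persona score minus the no-persona baseline, per persona.

    Assumed form of a 'differential'; the paper's exact definition
    is not given in this summary.
    """
    return {persona: score - baseline for persona, score in persona_scores.items()}


def summarize(baseline, persona_scores):
    """Collect the three illustrative quantities for one model and one bias test."""
    diffs = bias_differentials(baseline, persona_scores)
    values = list(diffs.values())
    return {
        "differentials": diffs,
        # Hypothetical stand-in for a sensitivity index: mean absolute shift.
        "sensitivity": mean(abs(d) for d in values),
        # Population standard deviation of the shifts across personas.
        "volatility": pstdev(values),
    }


# Invented example scores (e.g. effect sizes from one embedding-based test).
scores = {"persona_a": 0.5, "persona_b": 0.1, "persona_c": 0.3}
summary = summarize(0.2, scores)
print(summary)
```

Under these assumptions, a model whose scores move far from the baseline in a consistent direction would show high sensitivity but low volatility, while one whose scores swing in both directions would show high volatility.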
