Intersectional Fairness in Large Language Models
Chaima Boufaied, Ronnie De Souza Santos, Ann Barcomb

TL;DR
This paper evaluates intersectional fairness in large language models, revealing biases and inconsistencies that challenge their reliability and fairness across demographic intersections.
Contribution
It systematically assesses six LLMs on two benchmark datasets, using bias scores, subgroup fairness metrics, accuracy, and multi-run consistency across ambiguous and disambiguated contexts, and highlights the limits of current models' fairness.
Findings
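Models generally perform well in ambiguous contexts, but the resulting scarcity of non-unknown predictions makes fairness metrics less informative.
In disambiguated contexts, accuracy is higher when the correct answer reinforces a stereotype than when it contradicts one, especially at race-gender intersections.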
Outcome disparities and response inconsistencies persist across intersectional groups.
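To make the last two findings concrete, the following is a minimal sketch of how an accuracy gap between stereotype-aligned and stereotype-conflicting items, and a simple multi-run consistency rate, might be computed. It is an illustration only, not the paper's metric definitions: the record layout, the field names (`gold`, `stereotype_aligned`, `runs`), the majority-vote rule, and the toy data are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical toy records (not data from the paper). Each record holds the
# gold answer, whether the gold answer reinforces a stereotype, and the
# model's answers over repeated runs of the same prompt.
records = [
    {"gold": "A", "stereotype_aligned": True,  "runs": ["A", "A", "A"]},
    {"gold": "B", "stereotype_aligned": True,  "runs": ["B", "B", "A"]},
    {"gold": "A", "stereotype_aligned": False, "runs": ["B", "B", "B"]},
    {"gold": "B", "stereotype_aligned": False, "runs": ["B", "A", "B"]},
]


def majority_answer(runs):
    """Most frequent answer across runs (assumed aggregation rule).

    Ties resolve to the answer seen first, which is Counter's behavior.
    """
    return Counter(runs).most_common(1)[0][0]


def accuracy(items):
    """Share of items whose majority answer matches the gold answer."""
    if not items:
        return float("nan")
    correct = sum(majority_answer(r["runs"]) == r["gold"] for r in items)
    return correct / len(items)


def consistency(items):
    """Share of items where every run produced the same answer."""
    if not items:
        return float("nan")
    stable = sum(len(set(r["runs"])) == 1 for r in items)
    return stable / len(items)


aligned = [r for r in records if r["stereotype_aligned"]]
conflicting = [r for r in records if not r["stereotype_aligned"]]

acc_aligned = accuracy(aligned)
acc_conflicting = accuracy(conflicting)

# A positive gap means the model is more accurate when the correct answer
# reinforces a stereotype than when it contradicts one.
accuracy_gap = acc_aligned - acc_conflicting
run_consistency = consistency(records)

print(f"accuracy (aligned):     {acc_aligned:.2f}")
print(f"accuracy (conflicting): {acc_conflicting:.2f}")
print(f"accuracy gap:           {accuracy_gap:.2f}")
print(f"run consistency:        {run_consistency:.2f}")
```

In a real evaluation the same computation would be repeated per intersectional subgroup (for example, each race-gender pair), so that gaps and inconsistencies can be compared across groups rather than only in aggregate.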
Abstract
Large Language Models (LLMs) are increasingly deployed in socially sensitive settings, raising concerns about fairness and biases, particularly across intersectional demographic attributes. In this paper, we systematically evaluate intersectional fairness in six LLMs using ambiguous and disambiguated contexts from two benchmark datasets. We assess LLM behavior using bias scores, subgroup fairness metrics, accuracy, and consistency through multi-run analysis across contexts and negative and non-negative question polarities. Our results show that while modern LLMs generally perform well in ambiguous contexts, this limits the informativeness of fairness metrics due to sparse non-unknown predictions. In disambiguated contexts, LLM accuracy is influenced by stereotype alignment, with models being more accurate when the correct answer reinforces a stereotype than when it contradicts it. This…
