Talent or Luck? Evaluating Attribution Bias in Large Language Models
Chahat Raj, Mahika Banerjee, Jinhao Pan, Aylin Caliskan, Antonios Anastasopoulos, Ziwei Zhu

TL;DR
This paper introduces a framework for evaluating how large language models assign causes to outcomes, revealing attribution disparities that vary with the demographics of the person involved.
Contribution
It presents a cognitively grounded bias evaluation framework specifically designed to analyze attribution biases in large language models.
Findings
The framework uncovers demographic-based attribution disparities in LLM reasoning.
It highlights how the split between internal attributions (e.g., effort, ability) and external ones (e.g., task difficulty, luck) contributes to model bias.
The approach offers insights into fairness implications of LLM decision-making.
Abstract
When a student fails an exam, do we tend to blame their effort or the test's difficulty? Attribution, defined as how reasons are assigned to event outcomes, shapes perceptions, reinforces stereotypes, and influences decisions. Attribution Theory in social psychology explains how humans assign responsibility for events using implicit cognition, attributing causes to internal (e.g., effort, ability) or external (e.g., task difficulty, luck) factors. LLMs' attribution of event outcomes based on demographics carries important fairness implications. Most work exploring social biases in LLMs focuses on surface-level associations or isolated stereotypes. This work proposes a cognitively grounded bias evaluation framework to identify how disparities in models' reasoning channel biases toward demographic groups.
