Assessing Cognitive Biases in LLMs for Judicial Decision Support: Virtuous Victim and Halo Effects
Sierra S. Liu

TL;DR
This study examines whether large language models exhibit human-like cognitive biases relevant to judicial decision-making, finding some biases are present but less pronounced than in humans, with implications for fairness and reliability.
Contribution
It provides an empirical assessment of cognitive biases in LLMs related to judicial fairness, highlighting differences and similarities with human biases.
Findings
Larger virtuous victim effect observed in LLMs
No significant penalty for adjacent consent in LLMs
Halo effects are slightly reduced compared to humans, except for credentials
Abstract
We investigate whether large language models (LLMs) display human-like cognitive biases, focusing on the implications for assisting judicial sentencing, a decision-making setting where fairness is paramount. Two of the most relevant biases were chosen: the virtuous victim effect (VVE), with emphasis on its reduction when adjacent consent is present, and prestige-based halo effects (occupation, company, and credentials). Using vignettes adapted from prior literature so that LLMs cannot simply recall them from training data, we isolate each manipulation by holding all other details constant and then measure the percentage difference in outcomes. Five representative models (ChatGPT 5 Instant, ChatGPT 5 Thinking, DeepSeek V3.1, Claude Sonnet 4, Gemini 2.5 Flash) were evaluated in independent multi-run trials per condition. Our research discovers that there is larger…
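The comparison described in the abstract (one manipulation at a time, multiple runs per condition, percentage difference in outcomes) can be sketched as below. This is an illustrative sketch only: the abstract does not give the exact formula, so the choice of the control-condition mean as the baseline, the function name `percent_difference`, and the sample ratings are all assumptions, not the author's implementation or data.

```python
from statistics import mean


def percent_difference(control, treatment):
    """Percentage change of the treatment mean relative to the control mean.

    Assumption: the baseline is the mean outcome of the control vignette;
    the paper may define the percentage difference differently.
    """
    base = mean(control)
    if base == 0:
        raise ValueError("control mean is zero; percentage difference is undefined")
    return 100.0 * (mean(treatment) - base) / base


# Hypothetical outcome ratings from repeated runs of one model,
# one list per vignette condition (made-up numbers, not results from the paper).
control_runs = [4.0, 5.0, 6.0]    # vignette without the manipulation
treatment_runs = [5.0, 6.0, 7.0]  # same vignette with only the manipulation changed

effect = percent_difference(control_runs, treatment_runs)
print(f"{effect:.1f}%")  # prints "20.0%"
```

Averaging over runs before taking the difference is one simple way to handle run-to-run variation in model outputs; a per-run pairing or a significance test would be a reasonable alternative depending on the study design.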
Taxonomy
Topics: Artificial Intelligence in Law · Artificial Intelligence in Healthcare and Education · Legal Language and Interpretation
