The Paradox of Robustness: Decoupling Rule-Based Logic from Affective Noise in High-Stakes Decision-Making
Jon Chun, Katherine Elkins

TL;DR
This study reveals that aligned large language models are surprisingly robust to emotional framing in rule-based high-stakes decisions, contrasting with their known prompt sensitivity.
Contribution
It uncovers a paradoxical robustness of LLMs to affective bias in rule-bound contexts and provides a benchmark for evaluating this property.
Findings
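LLMs show negligible emotional-framing bias (Cohen's h = 0.003) in rule-bound, high-stakes decision tasks.
Framing effects in analogous human contexts are roughly two orders of magnitude larger (h in [0.3, 0.8]).
Robustness persists across eight models with diverse training paradigms and three domains (healthcare, finance, and education).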
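For readers unfamiliar with the effect-size measure quoted above, the following is a minimal sketch of how Cohen's h is computed from two proportions, using the standard arcsine transformation. The decision rates in the example are hypothetical values chosen only to illustrate the scale of h; they are not taken from the paper, and the paper's own analysis pipeline may differ.

```python
import math


def cohens_h(p1: float, p2: float) -> float:
    """Cohen's h: difference between two arcsine-transformed proportions."""
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("proportions must lie in [0, 1]")
    phi1 = 2.0 * math.asin(math.sqrt(p1))
    phi2 = 2.0 * math.asin(math.sqrt(p2))
    return phi1 - phi2


# Hypothetical rates of a favorable decision (NOT data from the paper).
neutral_rate = 0.5000      # neutral framing
emotional_rate = 0.5015    # emotional framing, almost no shift
shifted_rate = 0.6500      # a much larger shift, for comparison

h_small = abs(cohens_h(emotional_rate, neutral_rate))
h_large = abs(cohens_h(shifted_rate, neutral_rate))

print(f"small shift: h = {h_small:.4f}")
print(f"large shift: h = {h_large:.4f}")
```

Near a base rate of 0.5, a shift of about 0.15 percentage points corresponds to h of roughly 0.003, while a shift of 15 percentage points lands at the low end of the [0.3, 0.8] range reported for human contexts.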
Abstract
While Large Language Models (LLMs) are widely documented to be sensitive to minor prompt perturbations and prone to sycophantic alignment, their robustness in consequential, rule-bound decision-making remains under-explored. We uncover a striking "Paradox of Robustness": despite their known lexical brittleness, aligned LLMs exhibit strong robustness to emotional framing effects in rule-bound institutional decision-making. Using a controlled perturbation framework across three high-stakes domains (healthcare, finance, and education), we find a negligible effect size (Cohen's h = 0.003), approximately two orders of magnitude smaller than the substantial biases observed in analogous human contexts (h in [0.3, 0.8]). This invariance persists across eight models with diverse training paradigms, suggesting the mechanisms driving sycophancy and prompt sensitivity do not translate to…
