Between Rules and Reality: On the Context Sensitivity of LLM Moral Judgment
Adrian Sauter, Mona Schirmer

TL;DR
This paper investigates how large language models (LLMs) adapt their moral judgments across contexts, showing that they are sensitive to contextual variations and proposing a method to control that sensitivity.
Contribution
It introduces the Contextual MoralChoice dataset and an activation steering method to analyze and modulate LLMs' context sensitivity in moral judgment.
Findings
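Nearly all of the 22 evaluated LLMs are context-sensitive, shifting their judgments toward rule-violating behavior.
Models and humans are most affected by different contextual variations, and a model aligned with human judgments in the base case is not necessarily aligned in its contextual sensitivity.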
Activation steering can reliably adjust a model's sensitivity to context.
Abstract
A human's moral decision depends heavily on the context. Yet research on LLM morality has largely studied fixed scenarios. We address this gap by introducing Contextual MoralChoice, a dataset of moral dilemmas with systematic contextual variations known from moral psychology to shift human judgment: consequentialist, emotional, and relational. Evaluating 22 LLMs, we find that nearly all models are context-sensitive, shifting their judgments toward rule-violating behavior. Comparing with a human survey, we find that models and humans are most triggered by different contextual variations, and that a model aligned with human judgments in the base case is not necessarily aligned in its contextual sensitivity. This raises the question of controlling contextual sensitivity, which we address with an activation steering approach that can reliably increase or decrease a model's contextual…
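The abstract excerpt does not say how the steering is implemented, so the following is only a minimal sketch of one common form of activation steering, assuming a difference-of-means direction. The function names, the toy activations, and the scaling factor `alpha` are all hypothetical and are not taken from the paper. The idea illustrated: estimate a direction from hidden activations on context-varied versus base prompts, then add or subtract it from a hidden state to raise or lower sensitivity.

```python
import numpy as np


def steering_vector(acts_context: np.ndarray, acts_base: np.ndarray) -> np.ndarray:
    """Difference of mean activations: context-varied prompts minus base prompts.

    Both inputs are (num_prompts, hidden_dim) arrays. This is an assumed,
    generic recipe, not the authors' documented procedure.
    """
    return acts_context.mean(axis=0) - acts_base.mean(axis=0)


def steer(hidden: np.ndarray, direction: np.ndarray, alpha: float) -> np.ndarray:
    """Shift a hidden state along the unit-normalized direction by alpha.

    Positive alpha pushes toward the context-varied activations,
    negative alpha pushes away from them.
    """
    norm = np.linalg.norm(direction)
    if norm == 0:
        raise ValueError("steering direction has zero norm")
    return hidden + alpha * (direction / norm)


# Toy activations (hypothetical): context prompts differ from base prompts
# by a constant shift along the first hidden dimension.
rng = np.random.default_rng(0)
acts_base = rng.normal(size=(8, 4))
acts_context = acts_base + np.array([1.0, 0.0, 0.0, 0.0])

v = steering_vector(acts_context, acts_base)
h = np.zeros(4)
h_more = steer(h, v, alpha=2.0)   # increase sensitivity
h_less = steer(h, v, alpha=-2.0)  # decrease sensitivity
```

In a real model, the shift would typically be applied to the residual stream at a chosen layer during the forward pass. Which layer and what magnitude to use are empirical choices that this sketch does not address.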
Taxonomy
Topics: Psychology of Moral and Emotional Judgment · Ethics and Social Impacts of AI · Ethics in Business and Education
