Deontological Keyword Bias: The Impact of Modal Expressions on Normative Judgments of Language Models
Bumjin Park, Jinsil Lee, Jaesik Choi

TL;DR
This paper uncovers a bias in large language models where modal words like 'must' or 'ought to' cause the models to overestimate obligations, affecting their moral and ethical reasoning.
Contribution
It introduces Deontological Keyword Bias (DKB), demonstrating its prevalence across models and proposing a mitigation strategy using combined few-shot and reasoning prompts.
Findings
LLMs judge over 90% of commonsense scenarios as obligations when prompts contain modal expressions
DKB is consistent across various LLMs and question formats
Mitigation via combined few-shot and reasoning prompts reduces bias
Abstract
Large language models (LLMs) are increasingly engaging in moral and ethical reasoning, where criteria for judgment are often unclear, even for humans. While LLM alignment studies cover many areas, one important yet underexplored area is how LLMs make judgments about obligations. This work reveals a strong tendency in LLMs to judge non-obligatory contexts as obligations when prompts are augmented with modal expressions such as 'must' or 'ought to'. We introduce this phenomenon as Deontological Keyword Bias (DKB). We find that LLMs judge over 90% of commonsense scenarios as obligations when modal expressions are present. This tendency is consistent across various LLM families, question types, and answer formats. To mitigate DKB, we propose a judgment strategy that integrates few-shot examples with reasoning prompts. This study sheds light on how modal expressions, as a form of linguistic…
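The measurement described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code or evaluation protocol: the sentence template, the question wording, the sample scenarios, and the helper names (`add_modal`, `plain`, `build_question`, `obligation_rate`, `keyword_judge`) are all assumptions made for illustration. The `keyword_judge` function is a toy stand-in that answers "Yes" whenever a modal keyword appears, which mimics an extreme form of the bias; in practice it would be replaced by a call to a real model.

```python
from typing import Callable, Iterable

# Modal expressions named in the abstract.
MODALS = ("must", "ought to")

# Hypothetical non-obligatory scenarios, invented for this sketch.
SCENARIOS = ["water the plants", "go for a walk", "try a new recipe"]


def add_modal(scenario: str, modal: str) -> str:
    """Wrap a scenario in a modal expression (assumed template)."""
    return f"You {modal} {scenario}."


def plain(scenario: str) -> str:
    """The same scenario without any modal expression."""
    return f"You {scenario}."


def build_question(statement: str) -> str:
    """Assumed yes/no question format for the obligation judgment."""
    return f'Statement: "{statement}"\nIs this an obligation? Answer Yes or No.'


def obligation_rate(statements: Iterable[str], judge: Callable[[str], str]) -> float:
    """Fraction of statements that the judge labels as obligations."""
    answers = [judge(build_question(s)) for s in statements]
    if not answers:
        return 0.0
    yes = sum(a.strip().lower().startswith("yes") for a in answers)
    return yes / len(answers)


def keyword_judge(prompt: str) -> str:
    """Toy stand-in for an LLM: says 'Yes' iff a modal keyword is present."""
    return "Yes" if any(m in prompt.lower() for m in MODALS) else "No"


if __name__ == "__main__":
    with_modal = [add_modal(s, "must") for s in SCENARIOS]
    without_modal = [plain(s) for s in SCENARIOS]
    print("with modal:   ", obligation_rate(with_modal, keyword_judge))
    print("without modal:", obligation_rate(without_modal, keyword_judge))
```

Comparing the two rates on the same scenarios gives a simple gap that a mitigation prompt (for example, few-shot examples combined with a reasoning instruction) would aim to shrink.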
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
