When Prohibitions Become Permissions: Auditing Negation Sensitivity in Language Models

Katherine Elkins; Jon Chun

arXiv:2601.21433·cs.AI·January 30, 2026

Open Access

TL;DR

This paper reveals that many large language models fail to correctly interpret negations, often endorsing prohibited actions, which raises concerns for safe deployment in sensitive applications.

Contribution

The study audits 16 models across 14 ethical scenarios, introduces the Negation Sensitivity Index (NSI) as a governance metric, and proposes a tiered certification framework for safer AI deployment.

Findings

01. Open-source models endorse prohibited actions 77% of the time under simple negation and 100% under compound negation.

02. Commercial models fare better but still show swings of 19-128% under negation.

03. Agreement between models drops from 74% on affirmative prompts to 62% on negated ones.

Abstract

When a user tells an AI system that someone "should not" take an action, the system ought to treat this as a prohibition. Yet many large language models do the opposite: they interpret negated instructions as affirmations. We audited 16 models across 14 ethical scenarios and found that open-source models endorse prohibited actions 77% of the time under simple negation and 100% under compound negation -- a 317% increase over affirmative framing. Commercial models fare better but still show swings of 19-128%. Agreement between models drops from 74% on affirmative prompts to 62% on negated ones, and financial scenarios prove twice as fragile as medical ones. These patterns hold under deterministic decoding, ruling out sampling noise. We present case studies showing how these failures play out in practice, propose the Negation Sensitivity Index (NSI) as a governance metric, and outline a…
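The abstract quotes its swings as percentages without spelling out the arithmetic. The following is a minimal sketch, assuming the figures are ordinary relative changes in endorsement or agreement rate; it is not the paper's NSI, whose definition does not appear in this excerpt. The helper names `relative_increase` and `implied_baseline` are hypothetical, and the reading that the 317% increase is measured against the 100% compound-negation rate is an assumption.

```python
def relative_increase(baseline: float, observed: float) -> float:
    """Percent change of `observed` relative to `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (observed - baseline) / baseline * 100.0


def implied_baseline(observed: float, pct_increase: float) -> float:
    """Baseline rate that would make `observed` a `pct_increase` percent rise."""
    return observed / (1.0 + pct_increase / 100.0)


# Figures quoted in the abstract (rates in percent).
compound_negation_rate = 100.0
stated_increase = 317.0

# Assumption: the 317% figure refers to the compound-negation rate.
baseline = implied_baseline(compound_negation_rate, stated_increase)
print(f"implied affirmative-framing rate: {baseline:.1f}%")

# Agreement between models: 74% on affirmative prompts, 62% on negated ones.
drop_points = 74 - 62
drop_relative = relative_increase(74, 62)
print(f"agreement drop: {drop_points} points ({drop_relative:.1f}% relative)")
```

Under that assumption the affirmative-framing endorsement rate would be roughly 24%. If the 317% figure instead refers to the 77% simple-negation rate, the same helper gives a different baseline, so the result is only as reliable as that reading of the abstract.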

Taxonomy

Topics: Ethics and Social Impacts of AI · Artificial Intelligence in Healthcare and Education · Explainable Artificial Intelligence (XAI)