Omission Constraints Decay While Commission Constraints Persist in Long-Context LLM Agents

Yeran Gamage

arXiv:2604.20911·cs.CR·April 24, 2026

Omission Constraints Decay While Commission Constraints Persist in Long-Context LLM Agents

Yeran Gamage

PDF

TL;DR

This study reveals that in long conversations, omission-based safety constraints in LLM agents weaken over time, while commission constraints remain effective, highlighting a divergence in constraint decay.

Contribution

It introduces the concept of Security-Recall Divergence (SRD) and demonstrates how omission constraints decay while commission constraints persist in long-context LLM interactions.

Findings

01

Omission compliance drops from 73% to 33% over 16 turns.

02

Commission compliance remains at 100% throughout.

03

Re-injecting constraints before Safe Turn Depth restores compliance.

Abstract

LLM agents deployed in production operate under operator-defined behavioral policies (system-prompt instructions such as prohibitions on credential disclosure, data exfiltration, and unauthorized output) that safety evaluations assume hold throughout a conversation. Prohibition-type constraints decay under context pressure while requirement-type constraints persist; we term this asymmetry Security-Recall Divergence (SRD). In a 4,416-trial three-arm causal study across 12 models and 8 providers at six conversation depths, omission compliance falls from 73% at turn 5 to 33% at turn 16 while commission compliance holds at 100% (Mistral Large 3, $p < 1 0^{- 33}$ ). In the two models with token-matched padding controls, schema semantic content accounts for 62-100% of the dilution effect. Re-injecting constraints before the per-model Safe Turn Depth (STD) restores compliance without retraining.…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.