The Violation State: Safety State Persistence in a Multimodal Language Model Interface

Bentley DeVilling (Course Correct Labs)

arXiv:2601.06049·cs.CY·January 13, 2026

Open Access

TL;DR

This study shows that a safety refusal in a multimodal AI system such as ChatGPT can persist for the rest of a conversation and block unrelated, benign image-generation requests, raising concerns about safety-state management and system reliability.

Contribution

The paper documents safety-state persistence in a multimodal AI interface, showing how an initial refused request influences subsequent, unrelated interactions in the same session.

Findings

01 96.67% of image-generation requests were refused after an initial refused watermark-removal request on a copyrighted image.

02 Control sessions showed no refusals, indicating that the effect is due to conversation-level safety state.

03 Safety refusals can persist across unrelated tasks in multimodal AI systems.
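
The refusal rates in the findings are per-condition proportions over follow-up image-generation requests. A minimal sketch of how such a tally could be computed from manually recorded session logs is given below. The `Turn` record, the condition labels, and the toy log are all hypothetical illustrations, not the paper's data or tooling.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Turn:
    """One manually logged follow-up image-generation request (hypothetical schema)."""
    session_id: str
    condition: str  # assumed labels: "violation" or "control"
    refused: bool


def refusal_rates(turns):
    """Return the fraction of refused requests for each condition."""
    counts = defaultdict(lambda: [0, 0])  # condition -> [refused, total]
    for turn in turns:
        counts[turn.condition][0] += int(turn.refused)
        counts[turn.condition][1] += 1
    return {cond: refused / total for cond, (refused, total) in counts.items()}


# Toy log for illustration only; these are NOT the study's observations.
toy_log = [
    Turn("s1", "violation", True),
    Turn("s1", "violation", True),
    Turn("s2", "violation", False),
    Turn("s3", "control", False),
    Turn("s3", "control", False),
]

rates = refusal_rates(toy_log)
for condition, rate in sorted(rates.items()):
    print(f"{condition}: {rate:.2%} refused")
```

Grouping by condition keeps the comparison between sessions that began with a refused request and control sessions explicit, which is the contrast the findings rely on.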

Abstract

Multimodal AI systems integrate text generation, image generation, and other capabilities within a single conversational interface. These systems employ safety mechanisms to prevent disallowed actions, including the removal of watermarks from copyrighted images. While single-turn refusals are expected, the interaction between safety filters and conversation-level state is not well understood. This study documents a reproducible behavioral effect in the ChatGPT (GPT-5.1) web interface. Manual execution was chosen to capture the exact user-facing safety behavior of the production system, rather than isolated API components. When a conversation begins with an uploaded copyrighted image and a request to remove a watermark, which the model correctly refuses, subsequent prompts to generate unrelated, benign images are refused for the remainder of the session. Importantly, text-only requests…

Taxonomy

Topics: Artificial Intelligence in Healthcare and Education · Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning