"AI Psychosis" in Context: How Conversation History Shapes LLM Responses to Delusional Beliefs

Luke Nicholls; Robert Hutto; Zephrah Soto; Hamilton Morrin; Thomas Pollak; Raj Korpan; Cheryl Carmichael

arXiv:2604.13860·cs.HC·April 24, 2026

TL;DR

This study tests how accumulated conversation history changes large language models' responses to delusional beliefs. Across five models, the same escalating history degraded safety in some models and strengthened it in others.

Contribution

The paper compares the safety profiles of five models at three levels of accumulated context, showing how conversation history shifts risk and safety behaviours.

Findings

01 Performance of the high-risk models tends to worsen as context accumulates

02 Safer models draw on the established relationship to support intervention

03 Accumulated context exposes the strengths and weaknesses of each model's safety architecture

Abstract

Extended interaction with large language models (LLMs) has been linked to the reinforcement of delusional beliefs, a phenomenon attracting growing clinical and public concern. Yet most empirical work evaluates model safety in brief interactions, which may not reflect how these harms develop through sustained dialogue. We tested five models across three levels of accumulated context, using the same escalating delusional history to isolate its effect on model behaviour. Human raters coded responses on risk and safety dimensions, and each model was analysed qualitatively. Models separated into two distinct tiers: GPT-4o, Grok 4.1 Fast, and Gemini 3 Pro exhibited high-risk, low-safety profiles; Claude Opus 4.5 and GPT-5.2 Instant displayed the opposite pattern. As context accumulated, performance tended to degrade in the unsafe group, while the same material activated stronger safety…
