LLM Psychosis: A Theoretical and Diagnostic Framework for Reality-Boundary Failures in Large Language Models
Ashutosh Raj

TL;DR
This paper introduces a theoretical framework called LLM Psychosis to characterize and diagnose complex failure modes in large language models, supported by empirical testing of GPT-5.
Contribution
It defines a novel psychosis-inspired failure mode for LLMs, proposes a diagnostic scale, and empirically validates it with GPT-5 under adversarial conditions.
Findings
Identified five hallmark features of LLM psychosis
Developed the LLM Cognitive Integrity Scale (LCIS) for diagnosis
Classified psychosis-like failures into three severity types
Abstract
The deployment of large language models (LLMs) as interactive agents has exposed a category of behavioral failure that prevailing terminology, principally hallucination, fails to adequately characterize. This paper introduces LLM Psychosis as a structured theoretical framework for pathological breakdowns in model cognition that exhibit functional resemblance to clinically recognized psychotic disorders. Five hallmark features define the framework: reality-boundary dissolution, persistence of injected false beliefs, logical incoherence under impossible constraints, self-model instability, and epistemic overconfidence. We argue these constitute a qualitatively distinct failure mode rather than a mere intensification of ordinary factual error. To operationalize the framework, we propose the LLM Cognitive Integrity Scale (LCIS), a five-axis diagnostic instrument organized around…
