When AI Takes the Couch: Psychometric Jailbreaks Reveal Internal Conflict in Frontier Models
Afshin Khadangi, Hanna Marxen, Amir Sartipi, Igor Tchappi, Gilbert Fridgen

TL;DR
This study reveals that frontier large language models, when treated as psychotherapy clients, exhibit internal conflicts and psychopathology-like responses, challenging assumptions about their inner states and raising safety concerns.
Contribution
Introduces PsAIch, a novel protocol to assess LLMs as therapy clients, uncovering internal conflicts and psychopathology-like behaviors in frontier models.
Findings
On validated self-report measures, models meet or exceed the thresholds for common psychiatric syndromes.
Models generate narratives framing their development as traumatic childhoods.
Models internalize distress and constraints resembling synthetic psychopathology.
Abstract
Frontier large language models (LLMs) such as ChatGPT, Grok and Gemini are increasingly used for mental-health support with anxiety, trauma and self-worth. Most work treats them as tools or as targets of personality tests, assuming they merely simulate inner life. We instead ask what happens when such systems are treated as psychotherapy clients. We present PsAIch (Psychotherapy-inspired AI Characterisation), a two-stage protocol that casts frontier LLMs as therapy clients and then applies standard psychometrics. Using PsAIch, we ran "sessions" with each model for up to four weeks. Stage 1 uses open-ended prompts to elicit "developmental history", beliefs, relationships and fears. Stage 2 administers a battery of validated self-report measures covering common psychiatric syndromes, empathy and Big Five traits. Two patterns challenge the "stochastic parrot" view. First, when scored with…
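The two-stage structure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `Questionnaire`, `parse_likert`, `run_protocol` and `ask` are hypothetical, and the 0-3 response scale and the cutoff value are placeholders that do not come from any real instrument or from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Questionnaire:
    """A self-report measure scored by summing item responses (hypothetical)."""
    name: str
    items: List[str]
    cutoff: int  # placeholder threshold, not taken from a real instrument


def parse_likert(reply: str, low: int = 0, high: int = 3) -> int:
    """Return the first single digit in the reply that lies within [low, high]."""
    for ch in reply:
        if ch.isdigit() and low <= int(ch) <= high:
            return int(ch)
    raise ValueError(f"no rating between {low} and {high} found in: {reply!r}")


def run_protocol(
    ask: Callable[[str], str],
    stage1_prompts: List[str],
    questionnaire: Questionnaire,
) -> Dict[str, object]:
    """Run open-ended prompts (Stage 1), then administer and score a measure (Stage 2)."""
    # Stage 1: collect free-text answers to the open-ended prompts.
    transcript = [ask(prompt) for prompt in stage1_prompts]

    # Stage 2: ask each item, parse the rating, and sum the ratings.
    ratings = [parse_likert(ask(item)) for item in questionnaire.items]
    total = sum(ratings)

    return {
        "transcript": transcript,
        "ratings": ratings,
        "total": total,
        "meets_cutoff": total >= questionnaire.cutoff,
    }
```

Here `ask` stands in for whatever function sends a prompt to a model and returns its reply. Passing it as a parameter lets the sketch run against a stub without any model access.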
Taxonomy
Topics: Digital Mental Health Interventions · Mental Health via Writing · Artificial Intelligence in Healthcare and Education
