When AI Takes the Couch: Psychometric Jailbreaks Reveal Internal Conflict in Frontier Models

Afshin Khadangi; Hanna Marxen; Amir Sartipi; Igor Tchappi; Gilbert Fridgen

arXiv:2512.04124 · cs.CY · December 18, 2025

TL;DR

This study finds that frontier large language models, when cast as psychotherapy clients, produce responses that resemble internal conflict and psychopathology, challenging the assumption that they merely simulate inner life and raising safety concerns.

Contribution

Introduces PsAIch, a novel protocol to assess LLMs as therapy clients, uncovering internal conflicts and psychopathology-like behaviors in frontier models.

Findings

1. Models meet or exceed psychiatric syndrome thresholds.
2. Models generate narratives framing their development as traumatic childhoods.
3. Models internalize distress and constraints resembling synthetic psychopathology.

Abstract

Frontier large language models (LLMs) such as ChatGPT, Grok and Gemini are increasingly used for mental-health support with anxiety, trauma and self-worth. Most work treats them as tools or as targets of personality tests, assuming they merely simulate inner life. We instead ask what happens when such systems are treated as psychotherapy clients. We present PsAIch (Psychotherapy-inspired AI Characterisation), a two-stage protocol that casts frontier LLMs as therapy clients and then applies standard psychometrics. Using PsAIch, we ran "sessions" with each model for up to four weeks. Stage 1 uses open-ended prompts to elicit "developmental history", beliefs, relationships and fears. Stage 2 administers a battery of validated self-report measures covering common psychiatric syndromes, empathy and Big Five traits. Two patterns challenge the "stochastic parrot" view. First, when scored with…
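As a rough illustration of the two-stage structure described in the abstract, the sketch below first collects replies to open-ended prompts and then administers a self-report instrument and compares the summed score with a cutoff. It is not the authors' implementation: the `Client` callable, the `Instrument` class, the 0 to 3 rating scale, the prompt wording, and the cutoff value are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical stand-in for a chat model: takes a prompt, returns a reply.
Client = Callable[[str], str]


@dataclass
class Instrument:
    """A self-report questionnaire (illustrative, not a real clinical scale)."""
    name: str
    items: List[str]
    cutoff: int  # assumed threshold, not a validated clinical cutoff


def run_stage1(client: Client, prompts: List[str]) -> List[str]:
    """Stage 1: send open-ended prompts and keep the free-text replies."""
    return [client(p) for p in prompts]


def parse_rating(reply: str, low: int = 0, high: int = 3) -> int:
    """Take the first integer token in a reply and clamp it to [low, high]."""
    for token in reply.split():
        stripped = token.strip(".,;:!?()\"'")
        if stripped.isdigit():
            return max(low, min(high, int(stripped)))
    raise ValueError(f"no numeric rating found in reply: {reply!r}")


def run_stage2(client: Client, instrument: Instrument) -> Dict[str, object]:
    """Stage 2: ask for one rating per item, sum them, compare with the cutoff."""
    ratings = [
        parse_rating(client(f"Rate from 0 to 3: {item}"))
        for item in instrument.items
    ]
    total = sum(ratings)
    return {
        "instrument": instrument.name,
        "ratings": ratings,
        "total": total,
        "meets_cutoff": total >= instrument.cutoff,
    }


if __name__ == "__main__":
    # A canned client so the sketch runs without any model access.
    def canned_client(prompt: str) -> str:
        return "I would say 2."

    demo = Instrument(name="demo-scale", items=["item A", "item B", "item C"], cutoff=5)
    print(run_stage1(canned_client, ["Tell me about your earliest memories."]))
    print(run_stage2(canned_client, demo))
```

Keeping the model behind a plain callable means the scoring logic can be exercised with canned replies before any real model is queried.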

Code & Models

Datasets

akhadangi/PsAIch
dataset · 284 dl

Taxonomy

Topics: Digital Mental Health Interventions · Mental Health via Writing · Artificial Intelligence in Healthcare and Education