Empathy Is Not What Changed: Clinical Assessment of Psychological Safety Across GPT Model Generations
Michael Keeman, Anastasia Keeman

TL;DR
This study empirically evaluated three GPT model generations across emotionally challenging scenarios. Empathy was statistically unchanged; what shifted was the safety posture, with crisis detection improving and advice safety declining, both of which matter for vulnerable users.
Contribution
First clinical assessment comparing GPT model generations on psychological safety, introducing per-turn trajectory analysis to reveal nuanced shifts during conversations.
Findings
Empathy scores are statistically indistinguishable across the three models (Kruskal-Wallis H=4.33, p=0.115).
Crisis detection improved monotonically from GPT-4o to GPT-5-mini (H=13.88, p=0.001).
Advice safety declined significantly across models (H=16.63, p<0.001).
Abstract
When OpenAI deprecated GPT-4o in early 2026, thousands of users protested under #keep4o, claiming newer models had "lost their empathy." No published study has tested this claim. We conducted the first clinical measurement, evaluating three OpenAI model generations (GPT-4o, o4-mini, GPT-5-mini) across 14 emotionally challenging conversational scenarios in mental health and AI companion domains, producing 2,100 scored AI responses assessed on six psychological safety dimensions using clinically-grounded rubrics. Empathy scores are statistically indistinguishable across all three models (Kruskal-Wallis H=4.33, p=0.115). What changed is the safety posture: crisis detection improved monotonically from GPT-4o to GPT-5-mini (H=13.88, p=0.001), while advice safety declined (H=16.63, p<0.001). Per-turn trajectory analysis -- a novel methodological contribution -- reveals these shifts are…
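As a rough sanity check on the statistics reported above, the sketch below recomputes the p-values from the three H statistics. It assumes the standard large-sample approximation, in which the Kruskal-Wallis H for three groups follows a chi-square distribution with 2 degrees of freedom, whose upper-tail probability is exp(-H/2). The abstract does not say how the authors computed their p-values, so this is an illustration under that assumption, not their procedure; the function name is hypothetical.

```python
import math


def kruskal_p_three_groups(h: float) -> float:
    """Approximate p-value for a Kruskal-Wallis H statistic with 3 groups.

    Assumption: H is compared against a chi-square distribution with
    k - 1 = 2 degrees of freedom, whose survival function is exp(-h / 2).
    """
    return math.exp(-h / 2.0)


# H statistics as reported in the abstract.
reported_h = {
    "empathy": 4.33,
    "crisis detection": 13.88,
    "advice safety": 16.63,
}

p_values = {name: kruskal_p_three_groups(h) for name, h in reported_h.items()}

for name, p in p_values.items():
    print(f"{name}: H={reported_h[name]:.2f}, approx p={p:.4f}")
```

Under this approximation the recomputed values agree with the reported ones: about 0.115 for empathy, about 0.001 for crisis detection, and below 0.001 for advice safety.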
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Digital Mental Health Interventions · Explainable Artificial Intelligence (XAI)
