# Patients Prefer Human Empathy, but Not Always Human Wording: A Single-Blind Within-Subject Trial of GPT-Generated vs. Clinician Discharge Texts in Emergency Ophthalmology

**Authors:** Dea Samardzic, Jelena Curkovic, Donald Okmazic, Sandro Glumac, Josip Vrdoljak, Marija Skara Kolega, Ante Kreso

PMC · DOI: 10.3390/clinpract15110208 · Clinics and Practice · 2025-11-14

## TL;DR

Patients rated GPT-5-generated discharge texts as clear and useful as clinician-written ones, but less empathetic.

## Contribution

This study provides a direct, blinded, within-subject comparison of how patients perceive GPT-5-generated and clinician-written discharge texts in emergency ophthalmology.

## Key findings

- GPT-5-generated texts were rated significantly lower in empathy than clinician-written texts (means 3.97 vs. 4.30; p < 0.001).
- After correction for multiple comparisons, no significant differences were found in clarity, detail, usefulness, trust, satisfaction, or intention to follow advice between the two text types.
- Preferences were almost evenly split: 47.8% of participants who expressed a preference chose the GPT-5 text.

## Abstract

Background/Objectives: Written discharge explanations are crucial for patient understanding and safety in emergency eye care, yet their tone and clarity vary. Large language models (LLMs, artificial intelligence systems trained to generate human-like text) can produce patient-friendly materials, but direct, blinded comparisons with clinician-written texts remain scarce. This study compared patient perceptions of a routine clinician-written discharge text and an explanation generated by GPT-5 (OpenAI), a state-of-the-art LLM, both based on the same clinical facts in emergency ophthalmology. The primary outcome was empathy; secondary outcomes included clarity, detail, usefulness, trust, satisfaction, and intention to follow advice.

Methods: We conducted a prospective, single-blind, within-subject study in the Emergency Ophthalmology Unit of the University Hospital Centre Split, Croatia. Adults (n = 129) read two standardized texts (clinician-written vs. GPT-5-generated), presented in identical format and in randomized order under masking. Each participant rated both texts on 5-point Likert scales (1–5). Paired comparisons used Wilcoxon signed-rank tests with effect sizes, and secondary outcomes were adjusted using the Benjamini–Hochberg false discovery rate.

Results: Empathy ratings were lower for the GPT-5-generated text than for the clinician-written text (means 3.97 vs. 4.30; mean difference −0.33; 95% CI −0.44 to −0.22; p < 0.001). After correcting for multiple comparisons, no secondary outcome differed significantly between sources. Preferences were evenly split (47.8% preferred GPT-5 among those expressing a preference).

Conclusions: In emergency ophthalmology, GPT-5-generated explanations approached clinician-written materials on most perceived attributes but were rated less empathic. A structured, human-in-the-loop workflow, in which LLM-generated drafts are reviewed and tailored by clinicians, appears prudent for safe, patient-centered deployment.
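The analysis described in the Methods (paired Wilcoxon signed-rank comparisons with effect sizes, followed by Benjamini–Hochberg adjustment of the secondary outcomes) can be sketched in plain Python. This is a minimal illustration and not the authors' code: the function names are hypothetical, the ratings and p-values are invented for the example, and the matched-pairs rank-biserial correlation is an assumed choice of effect size, since the abstract does not say which one was used.

```python
def signed_rank_sums(a, b):
    """Return (W+, W-) for paired ratings a and b, dropping zero differences."""
    diffs = [x - y for x, y in zip(a, b) if x != y]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        # Find the run of tied absolute differences and give it the average rank.
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus


def rank_biserial(w_plus, w_minus):
    """Matched-pairs rank-biserial correlation (assumed effect size), in [-1, 1]."""
    total = w_plus + w_minus
    return 0.0 if total == 0 else (w_plus - w_minus) / total


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Invented 1-5 Likert ratings for five hypothetical participants (not study data).
gpt_ratings = [4, 3, 5, 4, 4]
clinician_ratings = [5, 4, 5, 5, 3]
w_plus, w_minus = signed_rank_sums(gpt_ratings, clinician_ratings)
effect_size = rank_biserial(w_plus, w_minus)

# Invented raw p-values for four hypothetical secondary outcomes (not study data).
raw_p = [0.01, 0.04, 0.03, 0.20]
adjusted_p = benjamini_hochberg(raw_p)
```

A negative rank-biserial value here means the first text (GPT-5 in this toy example) tended to be rated lower than the second. In practice, a statistics library would also supply the p-value for the signed-rank test itself, which this sketch does not compute.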

## Full-text entities

- **Chemicals:** GPT-5 (-)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12651557/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/PMC12651557/full.md

## References

15 references — full list in the complete paper: https://tomesphere.com/paper/PMC12651557/full.md

---
Source: https://tomesphere.com/paper/PMC12651557