# Comparison of Emotional Content in Text Responses From Physicians and AI Chatbots to Patient Health Queries: Cross-Sectional Study

**Authors:** Daniel T Burns, Channing Bice, Paul E Johnson, Nicholas Chia, Timothy Robinson

PMC · DOI: 10.2196/85516 · Journal of Medical Internet Research · 2026-03-06

## TL;DR

This study compares the emotional tone, readability, and use of disclaimers in health-related responses from physicians and two AI chatbots.

## Contribution

The study compares sentence-level emotional content, readability, and disclaimer use across physician, ChatGPT, and Gemini responses to the same 100 patient health questions.

## Key findings

- Chatbot responses were significantly longer and more difficult to read than physician responses.
- Gemini was more likely to include disclaimers and showed a wider range of emotions compared to ChatGPT.
- Physician responses had the highest readability and were more emotionally neutral compared to chatbots.

## Abstract

Surveys show that many people are willing to use generative artificial intelligence (AI) for health questions. Prior research has largely focused on chatbot accuracy, with some studies finding that both physicians and consumers overwhelmingly prefer chatbot-generated text over physician responses.

This study aimed to characterize and compare the emotional content of responses from physicians and 2 AI chatbots (OpenAI’s ChatGPT and Google’s Gemini) and to assess differences in reading level and use of medical disclaimers.

A public, patient-deidentified telehealth website was used to compile 100 physician-answered questions. The same questions were posed to both chatbots between May 18 and 19, 2025. Two coders classified the emotional content of each sentence using a predefined codebook and reviewed the classifications for agreement. Within each response, emotions were ranked as primary, secondary, and tertiary by the proportion of sentences classified as each emotion. Multinomial logistic regression compared emotional rankings using physician responses as the reference. Word count, Flesch Reading Ease, and Flesch-Kincaid Grade Level were analyzed via ANOVA with the Tukey honestly significant difference test. Disclaimer use was compared between chatbots using a χ² test.
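The ranking step described above can be illustrated with a minimal sketch. This is not the authors' code: the function name `rank_emotions`, the label strings, the alphabetical tie-breaking rule, and the use of `None` for a missing rank are all assumptions made for illustration, and the abstract does not state how ties were resolved.

```python
from collections import Counter


def rank_emotions(sentence_labels, depth=3):
    """Rank emotions in one response by their share of sentences.

    sentence_labels: one emotion label per sentence (hypothetical labels).
    Returns `depth` (label, proportion) pairs: primary, secondary, tertiary.
    Missing ranks are padded with (None, 0.0), e.g. "no tertiary emotion".
    """
    counts = Counter(sentence_labels)
    total = len(sentence_labels)
    # Sort by descending count; ties are broken alphabetically (an assumption).
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    ranked = [(label, n / total) for label, n in ordered[:depth]]
    # Pad when the response contains fewer than `depth` distinct emotions.
    ranked += [(None, 0.0)] * (depth - len(ranked))
    return ranked


# Hypothetical response with six coded sentences.
labels = ["neutral", "neutral", "neutral", "hope", "hope", "fear"]
ranking = rank_emotions(labels)
print(ranking)
```

Under these assumptions, a response coded with only one emotion would receive that emotion as primary and `None` for the secondary and tertiary ranks, which corresponds to the "no tertiary emotion" category reported in the results.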

Primary emotions were overwhelmingly neutral, except for one response from each chatbot in which anger was primary. For secondary emotions, the odds of hope were 80.28% (95% CI 37.71%-93.76%) lower for ChatGPT, while the odds of fear were 3.29 (95% CI 1.44-7.49) times higher for Gemini. For tertiary emotions, the odds of compassion were 1.94 (95% CI 1.06-3.54) times higher, and the odds of having no tertiary emotion were 84.33% (95% CI 64.72%-93.04%) lower for Gemini. Gemini responses averaged 889.1 (SD 305.7) words, ChatGPT 476.5 (SD 109.5), and physicians 193.5 (SD 113.6). Gemini had the lowest average Flesch Reading Ease score at 39.9 (SD 8.8), followed by ChatGPT at 45.8 (SD 12.8), while physicians had the highest at 51.9 (SD 13.6). Gemini had the highest average Flesch-Kincaid Grade Level at 11.3 (SD 1.5), followed by ChatGPT at 9.9 (SD 1.9), and physicians at 9.2 (SD 2.4). Gemini was significantly more likely to include a disclaimer than ChatGPT (χ²₁=49.2; P<.001).
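For readers unfamiliar with the two readability measures reported above, the following sketch applies the standard published Flesch Reading Ease and Flesch-Kincaid Grade Level formulas to raw counts. The function names and the example counts are hypothetical, and the abstract does not say which tool the authors used to count words, sentences, and syllables, so results from this sketch may differ slightly from theirs.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Standard Flesch Reading Ease formula; higher scores are easier to read."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words, sentences, syllables):
    """Standard Flesch-Kincaid Grade Level formula; approximates a US school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical counts for one response: 100 words, 5 sentences, 150 syllables.
fre = flesch_reading_ease(100, 5, 150)
fkgl = flesch_kincaid_grade(100, 5, 150)
print(round(fre, 2), round(fkgl, 2))
```

Both formulas depend only on average sentence length and average syllables per word, so longer sentences and longer words lower the Reading Ease score and raise the grade level, which is the direction of the differences reported between the chatbots and physicians.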

Chatbot responses were significantly (P<.001) longer and more difficult to read than physician responses and were more likely to contain a wider range of emotions. Qualitatively, chatbot responses were more varied in their presentation as well as in the breadth of the emotions themselves. The findings of this study could be used to inform more emotionally connected physician responses to patient message queries.

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC13005063/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/PMC13005063/full.md

## References

74 references — full list in the complete paper: https://tomesphere.com/paper/PMC13005063/full.md

---
Source: https://tomesphere.com/paper/PMC13005063