# Use of a Medical Communication Framework to Assess the Quality of Generative Artificial Intelligence Replies to Primary Care Patient Portal Messages: Content Analysis

**Authors:** Natalie S Lee, Nathan Richards, Jodi Grandominico, Robert M Cronin, Amanda K Hendricks, Ravi S Tripathi, Daniel E Jonas

PMC · DOI: 10.2196/71966 · JMIR Formative Research · 2025-07-31

## TL;DR

This study evaluates how well generative AI (GenAI) replies to patient portal messages in primary care. GenAI replies were comparatively strong at rapport building and facilitating next steps, but limitations outnumbered strengths in information gathering, information delivery, and responding to emotion.

## Contribution

The paper applies a "best practice" medical communication framework to 201 GenAI replies to real primary care patient portal messages, identifying themes in strengths and limitations across 5 communication domains and tallying how often each occurred.

## Key findings

- Strengths outnumbered limitations in rapport building (43.3% vs 17.4% of replies) and facilitating next steps (36.3% vs 19.4%).
- Limitations outnumbered strengths in information delivery (44.3% vs 21.4%), information gathering (29.9% vs 21.4%), and responding to emotion.
- About 26% of responses had only strengths, 27% had only limitations, and 46% had both.

## Abstract

There is growing interest in applying generative artificial intelligence (GenAI) to respond to electronic patient portal messages, particularly in primary care where message volumes are highest. However, evaluations of GenAI as an inbox communication tool are limited. Qualitative analysis of when and how often GenAI responses achieve communication goals can inform estimates of impact and guide continuous improvement.

This study aims to evaluate GenAI responses to primary care messages using a medical communication framework.

This was a descriptive quality improvement study of 201 GenAI replies to a purposively sampled, diverse pool of real primary care patient messages in a large midwestern academic medical center. Two physician reviewers (NSL and NR) used a hybrid deductive-inductive approach to qualitatively identify and define themes, guided by constructs from the “best practice” medical communication framework. After achieving thematic saturation, the reviewers assessed the presence or absence of identified communication themes, both independently and collaboratively. Discrepant observations were reconciled via discussion. Frequencies of identified themes were tallied.

Themes in strengths and limitations emerged across 5 communication domains. In the domain of rapport building, expressing respect and restating key phrases were strengths, while inappropriate or inadequate rapport building statements were limitations. For information gathering, questions that built toward a plan or elicited patient needs were strengths, while questions that were out of place or redundant were limitations. For information delivery, accurate content delivered clearly and professionally was a strength, but delivery of inaccurate content was an observed limitation. GenAI responses could facilitate next steps by outlining choices or providing instruction, but sometimes those next steps were inappropriate or premature. Finally, in responding to emotion, strengths were that emotions were named and validated, while inadequate or absent acknowledgment of emotion was a limitation. Overall, 26.4% (53/201) of all messages displayed communication strengths without limitations, 27.4% (55/201) had limitations without strengths, and the remaining 46.3% (93/201) had both. Strengths outnumbered limitations in rapport building (87/201, 43.3% vs 35/201, 17.4%) and facilitating next steps (73/201, 36.3% vs 39/201, 19.4%). Limitations outnumbered strengths in the remaining domains of information delivery (89/201, 44.3% vs 43/201, 21.4%), information gathering (60/201, 29.9% vs 43/201, 21.4%), and responding to emotion (17/201, 8.5% vs 9/201, 4.5%).

GenAI response quality on behalf of primary care physicians and advanced practice providers may vary by communication function. Expressions of respect or descriptions of common next steps may be appropriate, but gathering and delivering appropriate information, or responding to emotion, may be limited. While communication standards were often met, they were also often compromised. Understanding these strengths and limitations can inform decisions about whether, when, and how to apply GenAI as a tool for primary care inbox communication.
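The frequencies in the abstract follow directly from the raw counts, so they can be re-derived as a quick consistency check. The snippet below is a minimal sketch, not part of the study's analysis: the counts are copied from the abstract, while the names `overall`, `domains`, `pct`, and `dominant` are illustrative assumptions. The responding-to-emotion domain is left out for brevity.

```python
TOTAL = 201  # GenAI replies reviewed, per the abstract

# Overall breakdown reported in the abstract (counts out of 201).
overall = {
    "strengths only": 53,
    "limitations only": 55,
    "both": 93,
}

# (strengths, limitations) counts per domain, copied from the abstract.
# Dictionary and key names are illustrative, not the authors' coding scheme.
domains = {
    "rapport building": (87, 35),
    "facilitating next steps": (73, 39),
    "information delivery": (43, 89),
    "information gathering": (43, 60),
}


def pct(count, total=TOTAL):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


def dominant(domain):
    """Return which category was observed more often in a domain."""
    strengths, limitations = domains[domain]
    return "strengths" if strengths > limitations else "limitations"


if __name__ == "__main__":
    # The three overall categories are mutually exclusive and cover all replies.
    assert sum(overall.values()) == TOTAL
    for label, count in overall.items():
        print(f"{label}: {count}/{TOTAL} = {pct(count)}%")
    for name, (s, l) in domains.items():
        print(f"{name}: strengths {pct(s)}% vs limitations {pct(l)}% "
              f"-> more {dominant(name)}")
```

Unlike the three overall categories, the strength and limitation counts within a domain do not need to sum to 201, since a single reply could show both, or neither, in that domain.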

## Full-text entities

- **Diseases:** allergies (MESH:D004342), burnout (MESH:D002055)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12313158/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/PMC12313158/full.md

## References

35 references — full list in the complete paper: https://tomesphere.com/paper/PMC12313158/full.md

---
Source: https://tomesphere.com/paper/PMC12313158