# AI-generated explanations in kidney transplantation: accuracy vs. readability and implications for patient education

**Authors:** Oscar A. Garcia Valencia, Charat Thongprayoon, Jing Miao, Iasmina M. Craici, Wisit Cheungpasitporn

PMC · DOI: 10.3389/frai.2026.1806516 · Frontiers in Artificial Intelligence · 2026-03-11

## TL;DR

GPT-5.1 produced accurate explanations of 100 kidney transplant terms, and explicitly prompting for an eighth-grade reading level or lower brought them from very difficult to standard readability, which could help improve patient education.

## Contribution

The study demonstrates that prompt design can substantially improve readability without sacrificing accuracy in AI-generated patient education materials.

## Key findings

- AI-generated explanations were highly accurate with no clinically significant errors.
- Prompting for lower reading levels improved readability to a middle school level without losing accuracy.
- Under the initial prompt, 46% of explanations required college-level reading ability (mean Flesch–Kincaid Grade Level 13.4 ± 4.8); after prompt revision, the mean grade level fell to 6.3 ± 1.1.

## Abstract

Effective patient education is critical for informed decision-making and adherence in kidney transplantation. Generative artificial intelligence (AI), particularly large language models (LLMs), has the potential to enhance patient education in kidney transplantation; however, its factual accuracy and readability remain incompletely characterized.

We evaluated the performance of the GPT-5.1 (2025) model in generating plain-language explanations for 100 clinically relevant kidney transplantation terms. Explanations were generated using a standardized prompt (first round) and a revised prompt explicitly requesting an eighth-grade reading level or lower (second round). Accuracy was assessed by expert reviewers using a 5-point Likert scale. Readability was evaluated using the Flesch Reading Ease score (higher scores indicate easier text) and the Flesch–Kincaid Grade Level (higher scores indicate a higher education level required to understand the text). The study was conducted in November 2025.
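For readers unfamiliar with the two readability metrics, the sketch below computes both from the standard published formulas. It is an illustration only, not the tool used in the study: the `count_syllables` heuristic and the `readability` function name are assumptions introduced here, and production readability tools use more careful syllable and sentence detection, so scores will differ somewhat.

```python
import re
from typing import Tuple


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): count vowel groups, drop a silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def readability(text: str) -> Tuple[float, float]:
    """Return (Flesch Reading Ease, Flesch-Kincaid Grade Level) for English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)

    # Standard formulas: longer sentences and longer words lower the
    # Reading Ease score and raise the Grade Level.
    reading_ease = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
    grade_level = 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
    return reading_ease, grade_level


if __name__ == "__main__":
    plain = "The new kidney needs medicine every day."
    technical = "Immunosuppressive medications prevent allograft rejection."
    for sample in (plain, technical):
        ease, grade = readability(sample)
        print(f"{ease:7.1f} {grade:6.1f}  {sample}")
```

Both metrics depend only on average sentence length and average syllables per word, which is why a prompt that asks for shorter sentences and simpler vocabulary can shift them so strongly.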

All AI-generated explanations demonstrated high accuracy, with no clinically significant errors. In the first round, the mean Flesch Reading Ease score was 23.6 ± 23.4, indicating very difficult readability, and 46% of explanations required a college-level reading ability (mean Flesch–Kincaid Grade Level 13.4 ± 4.8). Following prompt revision, readability improved substantially. The mean Flesch Reading Ease score increased to 62.4 ± 7.5, corresponding to standard readability, and all explanations were written at a middle school level or below (mean Flesch–Kincaid Grade Level 6.3 ± 1.1).
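The labels "very difficult" and "standard" match the conventional interpretation bands for the Flesch Reading Ease score. The sketch below maps a score to those bands; the cutoffs are the commonly cited ones and are assumed here, since this summary does not state the exact thresholds the authors applied. The function name `flesch_band` is hypothetical.

```python
def flesch_band(score: float) -> str:
    """Map a Flesch Reading Ease score to its conventional difficulty label."""
    # (lower bound, label) pairs, checked from easiest to hardest.
    # Cutoffs are the commonly cited bands, assumed rather than taken from the paper.
    bands = [
        (90.0, "very easy"),
        (80.0, "easy"),
        (70.0, "fairly easy"),
        (60.0, "standard"),
        (50.0, "fairly difficult"),
        (30.0, "difficult"),
    ]
    for lower, label in bands:
        if score >= lower:
            return label
    return "very difficult"


if __name__ == "__main__":
    # Mean scores reported in the abstract for each prompting round.
    for round_name, mean_score in [("first round", 23.6), ("second round", 62.4)]:
        print(f"{round_name}: {mean_score} -> {flesch_band(mean_score)}")
```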

GPT-5.1 generated highly accurate explanations of kidney transplantation terms across prompting strategies. Explicit readability-focused prompting substantially improved readability without compromising accuracy, underscoring the importance of prompt design when deploying LLMs for patient-centered education in transplantation.

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC13012974/full.md

## References

63 references — full list in the complete paper: https://tomesphere.com/paper/PMC13012974/full.md

---
Source: https://tomesphere.com/paper/PMC13012974