# Comparative Quality Assessment of Artificial Intelligence in Patient Education on Platelet-Rich Plasma (PRP) Therapy

**Authors:** Jonas Krueckel, Dominik Szymski, Nura Ahmad, David Schiffelholz, Johannes Weber, Siska Buchhorn, Tomas Buchhorn, Kai Fehske, Siegmund Lang, Volker Alt, Franz Hilber

PMC · DOI: 10.3390/jpm16030173 · Journal of Personalized Medicine · 2026-03-23

## TL;DR

This study compares how well ChatGPT-4, ChatGPT-3.5, and Google Gemini answer common patient questions about PRP therapy. Blinded orthopedic surgeons rated most answers as satisfactory, but the responses often lacked detail and individualization.

## Contribution

The study evaluates the quality of AI-generated patient education on PRP therapy using orthopedic surgeon ratings and identifies model-specific strengths and weaknesses.

## Key findings

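- ChatGPT-3.5 received the highest proportion of excellent ratings (70%, versus 40% for ChatGPT-4 and 22% for Gemini) and outperformed both in overall quality.
- Insufficient detail was the most common limitation across all three models.
- ChatGPT-4 and Gemini performed similarly in several categories, but Gemini was rated lower in empathy and comprehensiveness.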

## Abstract

**Background:** Platelet-rich plasma (PRP) therapy is increasingly used for musculoskeletal conditions, yet patients seeking supplementary information online encounter resources of variable quality. Large language models (LLMs) such as ChatGPT and Google Gemini may support patient education, but their performance in answering common patient questions about PRP therapy has not been well characterized.

**Methods:** This study compared the quality of responses generated by ChatGPT-4, ChatGPT-3.5, and Google Gemini to common PRP-related patient questions. Ten frequently asked PRP-related questions were identified through a structured search of online sources, PubMed, Google Trends, and AI-assisted query generation. Each question was submitted to the three LLMs using a standardized prompt designed to elicit clear and empathetic responses. Five orthopedic surgeons, blinded to model identity, assessed each answer using a previously published four-tier rating framework. Secondary metrics included exhaustiveness, clarity, empathy, and response length.

**Results:** All models produced mostly satisfactory answers. ChatGPT-3.5 received the highest proportion of excellent ratings (70%), compared with 40% for ChatGPT-4 and 22% for Gemini, and outperformed both models in overall quality. The most common limitation across models was insufficient detail. ChatGPT-4 and Gemini performed similarly in several categories, although Gemini was rated lower in empathy and comprehensiveness. Overall differences between models were statistically significant.

**Conclusions:** Commonly available LLMs were able to provide mostly satisfactory responses to patient questions about PRP. However, important limitations remained, particularly with respect to detail and individualization. These tools may support initial patient information-seeking, but they should complement rather than replace expert medical counseling.
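The abstract reports the share of "excellent" ratings for each model. A minimal sketch of how such per-model shares could be tallied from individual rater scores is shown here. This is not the authors' analysis code: the function name `rating_shares`, the model labels, the tier names other than "excellent", and the toy data are all hypothetical, and the abstract does not state how ratings were pooled across raters and questions.

```python
from collections import Counter, defaultdict


def rating_shares(ratings):
    """Return {model: {tier: share}} from an iterable of (model, tier) pairs.

    Hypothetical helper for illustration; not taken from the paper.
    """
    counts = defaultdict(Counter)
    for model, tier in ratings:
        counts[model][tier] += 1
    return {
        model: {tier: n / sum(tiers.values()) for tier, n in tiers.items()}
        for model, tiers in counts.items()
    }


# Made-up toy ratings, NOT the study's data. Tier names other than
# "excellent" are placeholders for the four-tier framework.
toy_ratings = [
    ("model_a", "excellent"),
    ("model_a", "excellent"),
    ("model_a", "satisfactory"),
    ("model_a", "unsatisfactory"),
    ("model_b", "excellent"),
    ("model_b", "satisfactory"),
]

shares = rating_shares(toy_ratings)
print(shares["model_a"]["excellent"])  # 0.5 (2 of 4 toy ratings)
```

If each of the five surgeons rated all ten answers per model, the pool would hold 50 ratings per model, but that pooling is an assumption rather than something the abstract specifies.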

## Full-text entities

- **Diseases:** spine (MESH:D016135), musculoskeletal conditions (MESH:D009140), injury to (MESH:D014947), PRP (MESH:D000080203), tendinopathies (MESH:D052256), LLM (MESH:D007806), osteoarthritis (MESH:D010003), degenerative disorders (MESH:D019636)
- **Chemicals:** Gemini (-)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC13028038/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13028038/full.md

## References

39 references — full list in the complete paper: https://tomesphere.com/paper/PMC13028038/full.md

---
Source: https://tomesphere.com/paper/PMC13028038