# Quality, Empathy, and Readability of AI Chatbot Responses to the Survivorship Needs of Adolescents and Young Adults With Melanoma: Evaluation Study

**Authors:** Jordan Lily Jafarnia, Priscilla Lynne Haff, Reece Philip Moore, Alyssa Leigh Osheim, Katherine McKenna Riley, Sabrina Zheng, Michael Roth, Madeleine Hines Salge, Kelly Carter Nelson

PMC · DOI: 10.2196/84234 · JMIR Cancer · 2026-03-26

## TL;DR

This study evaluates how well AI chatbots address the top unmet needs of adolescent and young adult patients with melanoma, finding that responses vary in quality and empathy and are written above the average US reading level.

## Contribution

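The study combines measures of information quality (GQS, DISCERN), readability (Flesch-Kincaid Grade Level, Flesch Reading Ease), and perceived empathy (PETS) to evaluate AI chatbot responses to the top 5 survey-identified needs of adolescent and young adult patients with melanoma.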

## Key findings

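- ChatGPT achieved the highest mean quality (GQS 4.42, DISCERN 3.24) and empathy (PETS-ER 5.35, PETS-UT 6.36) scores, though with greater variability; Copilot scored lowest and Gemini was consistently midrange.
- Chatbot responses were written above the average US reading level (mean Flesch-Kincaid Grade Level 11.82; mean Flesch Reading Ease 38.60), limiting accessibility.
- Question framing strongly influenced chatbot performance: more emotional prompts drew greater empathy, and more readable responses also scored higher in quality and empathy.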

## Abstract

Melanoma, a highly aggressive form of skin cancer, is the second most common type of cancer for adolescent and young adult (AYA, ages 15-39 years) patients. AYA patients with melanoma may turn to internet sources, especially artificial intelligence (AI) chatbots, to manage uncertainty about prognosis and treatment.

This study aims to evaluate the quality, empathy, and readability of responses generated by leading AI chatbots when addressing the top unmet needs of AYA patients with melanoma receiving treatment.

Our research team recently surveyed 152 AYA patients with melanoma using the Needs Assessment Service Bridge, a validated instrument that assesses psychosocial needs for AYA patients with cancer. The survey identified the top 5 needs for AYA patients with advanced melanoma receiving treatment. Each need was reframed into a question with a brief clinical history, then entered into each chatbot by 5 individuals, who cleared their chat history before and after each question. Chatbot responses were evaluated for information quality (Global Quality Score [GQS] and DISCERN), accessibility and readability (GQS, Flesch-Kincaid Grade Level, Flesch Reading Ease), and perceived empathy (Perceived Empathy of Technology Scale [PETS], including the domains of Emotional Responsiveness [PETS-ER] and Understanding and Trust [PETS-UT]).
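For readers unfamiliar with the two readability measures named above, the following is a minimal sketch of the standard Flesch Reading Ease and Flesch-Kincaid Grade Level formulas. The function names, the regex-based tokenizer, and the vowel-group syllable counter are illustrative assumptions; the abstract does not say which tool the authors used, and syllable heuristics differ between tools, so scores from this sketch will not exactly match theirs.

```python
import re


def flesch_reading_ease(words, sentences, syllables):
    """Standard Flesch Reading Ease: higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words, sentences, syllables):
    """Standard Flesch-Kincaid Grade Level: approximate US school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def count_syllables(word):
    """Rough heuristic (an assumption): count vowel groups and
    drop a silent trailing 'e'. Real tools use better rules."""
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and n > 1:
        n -= 1
    return max(n, 1)


def text_counts(text):
    """Return (words, sentences, syllables) using naive splitting."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return len(words), len(sentences), syllables


if __name__ == "__main__":
    w, s, syl = text_counts("Skin checks help. See your doctor!")
    print(flesch_reading_ease(w, s, syl), flesch_kincaid_grade(w, s, syl))
```

Both formulas depend only on average sentence length and average syllables per word, so shorter sentences and simpler words raise Reading Ease and lower the grade level.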

Across 75 chatbot responses, ChatGPT achieved the highest average quality (mean GQS 4.42, SD 0.32; mean DISCERN 3.24, SD 0.31) and empathy (mean PETS-ER 5.35, SD 1.85; mean PETS-UT 6.36, SD 1.83), though with greater variability. Copilot produced the lowest quality and empathy scores, while Gemini responses were consistently midrange. PETS-UT exceeded PETS-ER across all models, suggesting stronger cognitive empathy than emotional responsiveness. Readability analysis showed that outputs exceeded the average US reading level (mean Flesch-Kincaid Grade Level 11.82, SD 1.44; mean Flesch Reading Ease 38.60, SD 9.00), limiting accessibility. Responses to question 2 were the most readable and also scored higher in quality and empathy, whereas questions 4 and 5 produced the most complex, difficult-to-read responses, which corresponded with lower quality and empathy ratings.
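The count of 75 responses is consistent with 3 chatbots, 5 questions, and 5 individuals entering each question, although the abstract does not spell out that arithmetic. A minimal sketch of that design grid and of a mean and SD summary follows. The names `CHATBOTS`, `OPERATORS`, `design_grid`, and `summarize` are hypothetical, and the use of the sample SD is an assumption, since the abstract does not state which SD was reported.

```python
from itertools import product
from statistics import mean, stdev

# Chatbots named in the abstract; the grid layout itself is an inference.
CHATBOTS = ["ChatGPT", "Gemini", "Copilot"]
QUESTIONS = range(1, 6)   # the top 5 needs, each reframed as a question
OPERATORS = range(1, 6)   # the 5 individuals who entered each question


def design_grid():
    """Enumerate every (chatbot, question, operator) combination."""
    return list(product(CHATBOTS, QUESTIONS, OPERATORS))


def summarize(scores):
    """Return (mean, sample SD), each rounded to 2 decimals.

    Sample SD (n - 1 denominator) is an assumption of this sketch.
    """
    return round(mean(scores), 2), round(stdev(scores), 2)


if __name__ == "__main__":
    print(len(design_grid()))  # 3 * 5 * 5 combinations
```

Grouping the grid by chatbot gives 25 responses per model, and grouping by question gives 15 responses per question.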

AI chatbots can provide moderately accurate and supportive responses to the needs of AYA patients with melanoma, but outputs are inconsistent, written above the recommended reading level for health information, and limited in empathy. Question framing strongly influenced chatbot performance: more emotionally framed prompts elicited greater empathy, and readability aligned with both quality and empathy. Chatbot use in this population should remain adjunctive, and further research is needed to standardize quality, improve readability, and enhance empathetic communication.

## Linked entities

- **Diseases:** melanoma (MONDO:0005105)

## Full-text entities

- **Genes:** EREG (epiregulin) [NCBI Gene 2069] {aka EPR, ER, Ep}
- **Diseases:** dizziness (MESH:D004244), skin cancer (MESH:D012878), leukemia (MESH:D007938), cognitive impairment (MESH:D003072), adrenal insufficiency (MESH:D000309), breast (MESH:D061325), Melanoma (MESH:D008545), Cancer (MESH:D009369), cutaneous melanoma (MESH:C562393), confusion (MESH:D003221), fatigue (MESH:D005221)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC13020680/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13020680/full.md

## References

21 references — full list in the complete paper: https://tomesphere.com/paper/PMC13020680/full.md

---
Source: https://tomesphere.com/paper/PMC13020680