# Using Natural Language Processing to Explore Patient Perspectives on AI Avatars in Support Materials for Patients With Breast Cancer: Survey Study

**Authors:** Eleanor Cheese, Raouef Ahmed Bichoo, Kartikae Grover, Dorin Dumitru, Alexandros Zenonos, Joanne Groark, Douglas Gibson, Rebecca Pope

PMC · DOI: 10.2196/70971 · Journal of Medical Internet Research · 2025-06-20

## TL;DR

This study used NLP to analyze free-text patient feedback on an AI-generated educational video for patients with breast cancer, finding that 81% of responses were positive or neutral and that negative comments centered on the AI avatar.

## Contribution

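The study demonstrates how combining NLP techniques (sentiment analysis, topic modeling, summarization, and term frequency–inverse document frequency word clouds) can extract meaningful insights from free-text patient feedback on AI-generated educational content.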

## Key findings

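- 81% (79/98) of patient responses were positive or neutral; negative feedback was mainly about the AI avatar.
- BERTopic with k-means clustering identified 4 key topics (the treatment pathway, video content, the digital avatar or narrator, and short responses with little or no content), while sentiment analysis and word clouds characterized the tone of each.
- AI-generated videos show potential as a lower-cost alternative to traditional leaflets and booklets, but the avatar needs technical improvement and should supplement, not replace, human health care interactions.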

## Abstract

Having well-informed patients is crucial to enhancing patient satisfaction, quality of life, and health outcomes, which in turn optimizes health care use. Traditional methods of delivering information, such as booklets and leaflets, are often ineffective and can overwhelm patients. Educational videos represent a promising alternative; however, their production typically requires significant time and financial resources. Video production using generative artificial intelligence (AI) technology may provide a solution to this problem.

This study aimed to use natural language processing (NLP) to understand free-text patient feedback on 1 of 7 AI-generated patient educational videos created in collaboration with Roche UK and the Hull University Teaching Hospitals NHS Trust breast cancer team, titled “Breast Cancer Follow Up Programme.”

A survey was sent to 400 patients who had completed the breast cancer treatment pathway, and 98 (24.5%) free-text responses were received for the question “Any comments or suggestions to improve its [the video’s] contents?” We applied and evaluated different NLP machine learning techniques to draw insights from these unstructured data, namely sentiment analysis, topic modeling, summarization, and term frequency–inverse document frequency word clouds.
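As an illustration of the term frequency–inverse document frequency weighting named above, the following is a minimal sketch in plain Python. It assumes one pooled document per topic and uses invented toy comments, not data from the study; the names `topic_docs`, `tfidf`, and `top_terms` are hypothetical, and the study's actual implementation is not described in this summary.

```python
import math
from collections import Counter

# Hypothetical toy comments pooled by topic; NOT data from the study.
topic_docs = {
    "treatment_pathway": "reassured by the follow up care the team was faultless",
    "video_content": "the video was informative and clear very clear",
    "avatar": "the avatar felt impersonal and the voice was robotic",
}


def tfidf(docs):
    """Return {doc_name: {term: weight}} using tf * log(N / df)."""
    tokenized = {name: text.lower().split() for name, text in docs.items()}
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized.values():
        df.update(set(tokens))

    weights = {}
    for name, tokens in tokenized.items():
        counts = Counter(tokens)
        total = len(tokens)
        weights[name] = {
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        }
    return weights


def top_terms(weights, k=3):
    """Pick the k highest-weighted terms per document (ties broken alphabetically)."""
    return {
        name: sorted(w, key=lambda t: (-w[t], t))[:k]
        for name, w in weights.items()
    }


weights = tfidf(topic_docs)
for name, terms in top_terms(weights).items():
    print(name, terms)
```

Terms that occur in every pooled document (such as "the") receive a weight of zero, which is why a word cloud built on these weights tends to surface topic-specific words rather than common ones.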

Sentiment analysis showed that 81% (79/98) of the responses were positive or neutral, while negative comments were predominantly related to the AI avatar. Topic modeling using BERTopic with k-means clustering was found to be the most effective model and identified 4 key topics: the breast cancer treatment pathway, video content, the digital avatar or narrator, and short responses with little or no content. The term frequency–inverse document frequency word clouds indicated positive sentiment about the treatment pathway (eg, “reassured” and “faultless”) and video content (eg, “informative” and “clear”), whereas the AI avatar was often described negatively (eg, “impersonal”). Summarization using the text-to-text transfer transformer model effectively created summaries of the responses by topic.
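The reported proportions can be checked with a few lines of arithmetic. This sketch uses only the counts stated in the abstract (400 surveyed, 98 free-text responses, 79 positive or neutral); the count of negative responses is derived here on the assumption that every response was labeled positive, neutral, or negative, and the variable names are illustrative.

```python
from fractions import Fraction

# Counts taken from the abstract.
surveyed = 400
free_text_responses = 98
positive_or_neutral = 79

response_rate = Fraction(free_text_responses, surveyed)
pos_neutral_share = Fraction(positive_or_neutral, free_text_responses)

# Derived under the assumption that the remaining responses were negative.
negative = free_text_responses - positive_or_neutral
negative_share = Fraction(negative, free_text_responses)


def pct(fraction, digits=1):
    """Convert an exact fraction to a rounded percentage."""
    return round(float(fraction * 100), digits)


print(f"Response rate: {pct(response_rate)}%")
print(f"Positive or neutral: {pct(pos_neutral_share)}%")
print(f"Negative (derived): {negative} responses, {pct(negative_share)}%")
```

The positive or neutral share works out to roughly 80.6%, which rounds to the 81% quoted in the abstract.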

This study demonstrates the success of NLP techniques in efficiently generating insights into patient feedback related to generative AI educational content. Combining NLP methods resulted in clear visuals and insights, enhancing the understanding of patient feedback. Analysis of free-text responses provided clinicians at Hull University Teaching Hospitals NHS Trust with deeper insights than those obtained from quantitative Likert scale responses alone. Importantly, the results validate the use of generative AI in creating patient educational videos, highlighting its potential to address the challenges of costly video production and the limitations of traditional, often overwhelming educational leaflets. Despite the positive overall feedback, negative comments focused on the technical aspects of the AI avatar, indicating areas for improvement. We advocate that patients who receive AI avatar explanations be counseled that this technology is intended to supplement, not replace, human health care interactions. Future investigations are needed to confirm the ongoing effectiveness of these educational tools.

## Linked entities

- **Diseases:** breast cancer (MONDO:0004989)

## Full-text entities

- **Diseases:** Breast Cancer (MESH:D001943)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12228011/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12228011/full.md

## References

121 references — full list in the complete paper: https://tomesphere.com/paper/PMC12228011/full.md

---
Source: https://tomesphere.com/paper/PMC12228011