Large language models for patient education prior to interventional radiology procedures: a comparative study
Bogdan Levita, Semil Eminovic, Willie Magnus Lüdemann, Dirk Schnapauff, Robin Schmidt, Anna-Maria Haack, Andrea Dell’Orco, Jawed Nawabi, Tobias Penzkofer

TL;DR
This study compares how well four large language models can answer patient questions about specific interventional radiology procedures, finding that some models perform well enough to potentially aid patient education.
Contribution
The study evaluates LLMs for patient education in interventional radiology and identifies performance differences across models and procedures.
Findings
DeepSeek-V3 and ChatGPT-4o outperformed OpenBioLLM-8b and BioMistral-7b in answering questions about interventional radiology procedures.
Preparation/Planning was the only category without significant differences across all models and procedures.
LLMs such as DeepSeek-V3 and ChatGPT-4o show potential to enhance patient education but cannot yet replace clinical consultations.
Abstract
This study evaluates four large language models’ (LLMs) ability to answer common patient questions preceding transarterial periarticular embolization (TAPE), computed tomography (CT)-guided high-dose-rate (HDR) brachytherapy, and bleomycin electrosclerotherapy (BEST). The goal is to evaluate their potential to enhance clinical workflows and patient comprehension, while also assessing associated risks. Thirty-five TAPE-related, 34 CT-HDR brachytherapy-related, and 36 BEST-related questions were presented to ChatGPT-4o, DeepSeek-V3, OpenBioLLM-8b, and BioMistral-7b. The LLM-generated responses were independently assessed by two board-certified radiologists. Accuracy was rated on a 5-point Likert scale. Statistical analysis compared LLM performance across question categories to assess suitability for patient education. DeepSeek-V3 attained the highest mean scores for BEST [4.49 (± 0.77)] and CT-HDR [4.24 (± 0.81)] and…
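The "mean (± SD)" figures quoted above can be illustrated with a minimal sketch. It assumes that each answer receives one 1 to 5 Likert score from each of the two raters and that both raters' scores are pooled before summarizing. The abstract does not state how the two ratings were combined, so the pooling step, the model names, and all numbers below are hypothetical and are not data from the study.

```python
from statistics import mean, stdev

# Hypothetical ratings on a 5-point Likert scale (NOT data from the study):
# model name -> list of (rater_a, rater_b) scores, one pair per question.
ratings = {
    "model_a": [(5, 4), (4, 4), (5, 5)],
    "model_b": [(3, 2), (4, 3), (2, 2)],
}

def summarize(pairs):
    """Pool both raters' scores (an assumption) and return (mean, sample SD)."""
    pooled = [score for pair in pairs for score in pair]
    return mean(pooled), stdev(pooled)

# Summarize each model and print in the "mean (± SD)" style used in the abstract.
summary = {model: summarize(pairs) for model, pairs in ratings.items()}
for model, (m, sd) in summary.items():
    print(f"{model}: {m:.2f} (± {sd:.2f})")
```

Because Likert scores are ordinal, a mean and standard deviation are only a summary. Comparing models for significance would normally use a separate statistical test, which the visible portion of the abstract does not name.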
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Radiomics and Machine Learning in Medical Imaging · Radiology practices and education
