Root Cause Analysis of Radiation Oncology Incidents Using Large Language Models

Yuntao Wang; Mariluz De Ornelas; Matthew T. Studenski; Elizabeth Bossart; Siamak P. Najad-Davarani; Yunze Yang

arXiv:2508.17201 · physics.med-ph · December 19, 2025

TL;DR

This study evaluates the reasoning abilities of large language models in performing root cause analysis of radiation oncology incidents, demonstrating their potential to support patient safety and quality improvement efforts.

Contribution

The paper systematically assesses how well multiple LLMs perform RCA tasks in radiation oncology, using real incident reports.

Findings

01. GPT-4o achieved the highest semantic similarity

02. Gemini 2.5 Pro had the highest recall and accuracy

03. LLMs showed promising performance with some hallucinations

Abstract

Purpose: To evaluate the reasoning capabilities of large language models (LLMs) in performing root cause analysis (RCA) of radiation oncology incidents using narrative reports from the Radiation Oncology Incident Learning System (RO-ILS), and to assess their potential utility in supporting patient safety efforts.

Methods and Materials: Four LLMs (Gemini 2.5 Pro, GPT-4o, o3, and Grok 3) were prompted with the 'Background and Incident Overview' sections of 19 public RO-ILS cases. Using a standardized prompt based on AAPM RCA guidelines, each model was instructed to identify root causes, lessons learned, and suggested actions. Outputs were assessed using semantic similarity metrics (cosine similarity via Sentence Transformers), semi-subjective evaluations (precision, recall, F1-score, accuracy, hallucination rate, and four performance criteria: relevance, comprehensiveness, justification,…
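The metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code: the embedding vectors are toy values standing in for Sentence Transformers output, the root-cause labels are hypothetical, and treating root causes as exact-match label sets is an assumption made only for illustration. The function names `cosine_similarity` and `precision_recall_f1` are likewise invented here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def precision_recall_f1(predicted, reference):
    """Set-based precision, recall, and F1 for predicted vs. reference labels."""
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy embeddings (assumed values, not real model output).
model_embedding = [0.2, 0.7, 0.1]
reference_embedding = [0.25, 0.65, 0.05]
similarity = cosine_similarity(model_embedding, reference_embedding)

# Hypothetical root-cause labels for one incident.
model_causes = {"communication", "workflow"}
reference_causes = {"workflow", "training"}
precision, recall, f1 = precision_recall_f1(model_causes, reference_causes)
```

In practice the two vectors would come from encoding the model's answer and the reference RO-ILS analysis with the same embedding model, and the label matching would likely require human judgment rather than exact string equality.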
