Impact of Large Language Model Assistance on Radiologists’ Diagnostic Performance for Brain Tumors by Experience Level

Chae Won Song; Byung Hyun Baek; Seul Kee Kim; Woong Yoon; Yun Young Lee; Ilwoo Park; Jae Hyun Park; Seol Bin Park; In Woo Choi

PMC · DOI: 10.3390/jcm15041673 · February 23, 2026

Open Access

TL;DR

This study shows that large language model assistance can improve brain tumor diagnosis by both radiologists and trainees, with the larger benefit seen in trainees.

Contribution

The novel finding is that LLM assistance significantly improves trainees' diagnostic accuracy and expands differential considerations.

Findings

01  LLMs such as Claude 3.5 Sonnet and ChatGPT-4o achieved high top-three differential diagnostic accuracy, comparable to that of radiologists.

02  Trainees' diagnostic accuracy improved significantly with LLM assistance for both primary and differential diagnoses.

03  Radiologists' top-three differential accuracy improved notably after they received LLM-generated diagnoses.

Abstract

Background: Large language models (LLMs) may assist radiologists in interpreting brain tumor MRI. We compared the diagnostic accuracy of ChatGPT-4o and Claude 3.5 Sonnet with that of board-certified radiologists and trainees, and evaluated whether LLM assistance could enhance diagnostic performance.

Methods: A total of 127 histologically confirmed brain tumor cases were included. Two LLMs analyzed representative MRI images together with structured radiologic reports, whereas two board-certified radiologists and three trainees reviewed representative images with basic demographic information only. All participants generated up to three differential diagnoses per case. The accuracy of the primary diagnosis and the accuracy of the top-three differential diagnoses were calculated and compared. Following the initial readings, LLM-generated differential diagnoses were provided to the readers,…
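
The two metrics named in the Methods can be illustrated with a minimal sketch. It assumes that each reader's output is an ordered list of up to three diagnosis labels and that a hit is an exact string match against the histologically confirmed diagnosis; the abstract does not state how matches were adjudicated, so that rule, the function name `top_k_accuracy`, and the toy cases below are all hypothetical and not taken from the study.

```python
from typing import Sequence


def top_k_accuracy(
    truths: Sequence[str],
    differentials: Sequence[Sequence[str]],
    k: int,
) -> float:
    """Fraction of cases whose confirmed diagnosis is among the first k listed.

    Assumption: a diagnosis counts as correct only on an exact label match.
    """
    if len(truths) != len(differentials):
        raise ValueError("one differential list is required per case")
    if not truths:
        raise ValueError("at least one case is required")
    hits = sum(
        1 for truth, ranked in zip(truths, differentials) if truth in ranked[:k]
    )
    return hits / len(truths)


# Hypothetical toy data (four invented cases, not from the paper).
truths = ["glioblastoma", "meningioma", "lymphoma", "metastasis"]
reader = [
    ["glioblastoma", "metastasis", "lymphoma"],  # correct as primary
    ["schwannoma", "meningioma"],                # correct within top three
    ["glioblastoma", "metastasis", "abscess"],   # missed
    ["metastasis"],                              # correct as primary
]

primary_acc = top_k_accuracy(truths, reader, k=1)  # primary-diagnosis accuracy
top3_acc = top_k_accuracy(truths, reader, k=3)     # top-three differential accuracy
print(primary_acc, top3_acc)
```

With k=1 the function scores only the first-listed (primary) diagnosis, and with k=3 it scores the full differential, so top-three accuracy can never be lower than primary accuracy for the same reader.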

Figures: 5

Taxonomy

Topics: Artificial Intelligence in Healthcare and Education · Radiomics and Machine Learning in Medical Imaging · Radiology practices and education