How Well Can Modern LLMs Act as Agent Cores in Radiology Environments?
Qiaoyu Zheng, Chaoyi Wu, Pengcheng Qiu, Lisong Dai, Ya Zhang, Yanfeng Wang, Weidi Xie

TL;DR
This paper introduces RadA-BenchPlat, a comprehensive evaluation platform for large language models acting as agent cores in radiology, revealing current capabilities and limitations in complex clinical tasks and tool integration.
Contribution
The paper presents RadA-BenchPlat, a new benchmarking platform with synthetic radiology data, and demonstrates how prompt engineering and automated tool building can improve LLM performance in radiology tasks.
Findings
Claude-3.7-Sonnet achieves 67.1% task completion in routine settings.
Prompt engineering strategies improve complex task performance by 48.2%.
Automated tool building achieves 65.4% success rate.
Abstract
We introduce RadA-BenchPlat, an evaluation platform that benchmarks the performance of large language models (LLMs) acting as agent cores in radiology environments using 2,200 radiologist-verified synthetic patient records covering six anatomical regions, five imaging modalities, and 2,200 disease scenarios, resulting in 24,200 question-answer pairs that simulate diverse clinical situations. The platform also defines ten categories of tools for agent-driven task solving and evaluates seven leading LLMs, revealing that while models like Claude-3.7-Sonnet can achieve a 67.1% task completion rate in routine settings, they still struggle with complex task understanding and tool coordination, limiting their capacity to serve as the central core of automated radiology systems. By incorporating four advanced prompt engineering strategies--where prompt-backpropagation and multi-agent collaboration…
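As a rough illustration of the figures quoted in the abstract, the sketch below works out the average number of question-answer pairs per patient record and shows one plausible way a task completion rate could be computed. The function names and the boolean-outcome representation are assumptions made for illustration; the abstract does not describe the platform's actual scoring code, nor does it say that pairs are spread evenly across records.

```python
# Dataset sizes as quoted in the abstract.
NUM_RECORDS = 2_200
NUM_QA_PAIRS = 24_200


def qa_pairs_per_record(num_qa_pairs: int, num_records: int) -> float:
    """Average number of question-answer pairs per patient record."""
    return num_qa_pairs / num_records


def completion_rate(outcomes: list[bool]) -> float:
    """Hypothetical metric: percentage of tasks marked as completed.

    `outcomes` holds one boolean per task (True = completed). This is an
    assumed representation, not the platform's documented format.
    """
    if not outcomes:
        raise ValueError("outcomes must contain at least one task")
    return 100.0 * sum(outcomes) / len(outcomes)


if __name__ == "__main__":
    print(qa_pairs_per_record(NUM_QA_PAIRS, NUM_RECORDS))  # 11.0
    print(completion_rate([True, False, True, True]))      # 75.0
```

Under these assumptions, the 24,200 pairs correspond to an average of 11 questions per synthetic record.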
Taxonomy
Topics: Advanced X-ray and CT Imaging · Radiology practices and education
