Leveraging Professional Radiologists' Expertise to Enhance LLMs' Evaluation for Radiology Reports
Qingqing Zhu, Xiuying Chen, Qiao Jin, Benjamin Hou, Tejas Sudharshan Mathai, Pritam Mukherjee, Xin Gao, Ronald M Summers, Zhiyong Lu

TL;DR
This paper introduces a novel evaluation method for radiology reports that combines radiologist expertise with advanced LLM techniques, significantly improving the accuracy and robustness of AI report assessments.
Contribution
The paper integrates radiologist standards with LLMs using ICIL and CoT reasoning, and reports higher report-evaluation accuracy than existing metrics.
Findings
GPT-4 (5-shot) outperforms METEOR by 0.19 in evaluation score
Regressed GPT-4 aligns more closely with radiologist judgments
The method demonstrates robustness validated through iterative testing
Abstract
In radiology, Artificial Intelligence (AI) has significantly advanced report generation, but automatic evaluation of these AI-produced reports remains challenging. Current metrics, such as Conventional Natural Language Generation (NLG) and Clinical Efficacy (CE), often fall short in capturing the semantic intricacies of clinical contexts or overemphasize clinical details, undermining report clarity. To overcome these issues, our proposed method synergizes the expertise of professional radiologists with Large Language Models (LLMs), like GPT-3.5 and GPT-4. Utilizing In-Context Instruction Learning (ICIL) and Chain of Thought (CoT) reasoning, our approach aligns LLM evaluations with radiologist standards, enabling detailed comparisons between human- and AI-generated reports. This is further enhanced by a Regression model that aggregates sentence evaluation scores. Experimental results…
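The abstract mentions a regression model that aggregates sentence evaluation scores. The listing does not say which features or which regression form the authors use, so the following is only a minimal sketch under stated assumptions: each report's LLM sentence scores are averaged into a single feature, and a one-variable ordinary least squares line maps that feature to a radiologist score. All names (`fit_line`, `report_feature`, `predict`) and all numbers are hypothetical and for illustration only.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct x values")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def report_feature(sentence_scores):
    """Assumed aggregation: the mean of the per-sentence LLM scores."""
    return mean(sentence_scores)


# Hypothetical toy data: per-sentence LLM scores for three reports,
# paired with made-up radiologist scores for the same reports.
llm_sentence_scores = [
    [1.0, 1.0, 1.0],
    [0.0, 1.0, 0.5],
    [0.0, 0.0, 0.0],
]
radiologist_scores = [5.0, 3.0, 1.0]

features = [report_feature(s) for s in llm_sentence_scores]
slope, intercept = fit_line(features, radiologist_scores)


def predict(sentence_scores):
    """Map a new report's sentence scores to a report-level score."""
    return slope * report_feature(sentence_scores) + intercept
```

Calling `predict` on a new report's sentence scores returns a report-level score on the radiologists' scale. A real setup would likely use richer features than a single mean, and the paper should be consulted for the actual design.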
Taxonomy
Topics: Radiology practices and education · Radiation Dose and Imaging
Methods: Attention Is All You Need · Cosine Annealing · Linear Layer · Position-Wise Feed-Forward Layer · Label Smoothing · Absolute Position Encodings · Transformer · Linear Warmup With Cosine Annealing
