Exploring the Boundaries of GPT-4 in Radiology
Qianchu Liu, Stephanie Hyland, Shruthi Bannur, Kenza Bouzid, Daniel C. Castro, Maria Teodora Wetscherek, Robert Tinn, Harshita Sharma, Fernando Pérez-García, Anton Schwaighofer, Pranav Rajpurkar, Sameer Tajdin Khanna, Hoifung Poon, Naoto Usuyama, Anja Thieme

TL;DR
This paper evaluates GPT-4's performance on radiology report tasks, showing it often surpasses or matches specialized models, with strong zero-shot results and potential for clinical application.
Contribution
It demonstrates GPT-4's competitive performance in radiology text tasks, highlighting its zero-shot capabilities and potential as a generalist model in medical NLP.
Findings
GPT-4 outperforms SOTA models in temporal sentence similarity and natural language inference.
GPT-4 matches supervised models in findings summarisation with example prompting.
Error analysis shows GPT-4 has sufficient radiology knowledge, with only occasional errors in contexts that require nuanced domain knowledge.
Abstract
The recent success of general-domain large language models (LLMs) has significantly changed the natural language processing paradigm towards a unified foundation model across domains and applications. In this paper, we focus on assessing the performance of GPT-4, the most capable LLM so far, on the text-based applications for radiology reports, comparing against state-of-the-art (SOTA) radiology-specific models. Exploring various prompting strategies, we evaluated GPT-4 on a diverse range of common radiology tasks and we found GPT-4 either outperforms or is on par with current SOTA radiology models. With zero-shot prompting, GPT-4 already obtains substantial gains (≈10% absolute improvement) over radiology models in temporal sentence similarity classification (accuracy) and natural language inference (F1). For tasks that require learning dataset-specific style or schema…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Artificial Intelligence in Healthcare and Education
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Softmax · Position-Wise Feed-Forward Layer · Dense Connections · Absolute Position Encodings · Adam · Label Smoothing · Residual Connection
