Consensus, dissensus and synergy between clinicians and specialist foundation models in radiology report generation
Ryutaro Tanno, David G.T. Barrett, Andrew Sellergren, Sumedh Ghaisas, Sumanth Dathathri, Abigail See, Johannes Welbl, Karan Singhal, Shekoofeh Azizi, Tao Tu, Mike Schaekermann, Rhys May, Roy Lee, SiWai Man, Zahra Ahmed, Sara Mahdavi, Yossi Matias, Joelle Barral, Ali Eslami

TL;DR
This study develops a state-of-the-art AI system for radiology report generation, demonstrating that AI can produce reports preferred or deemed equivalent to human reports in clinical settings, and explores clinician-AI collaboration.
Contribution
The paper introduces Flamingo-CXR, a novel vision-language model fine-tuned for radiology report generation, and evaluates its clinical quality through radiologist assessments, highlighting AI-human complementarity.
Findings
Radiologists preferred the AI reports, or judged them equivalent to the human reports, in over 60% of cases.
AI-generated reports often contain location and finding errors.
Clinician-AI collaboration improves report quality, with 80% of inpatient reports deemed equivalent or better.
Abstract
Radiology reports are an instrumental part of modern medicine, informing key clinical decisions such as diagnosis and treatment. The worldwide shortage of radiologists, however, restricts access to expert care and imposes heavy workloads, contributing to avoidable errors and delays in report delivery. While recent progress in automated report generation with vision-language models offers clear potential in ameliorating the situation, the path to real-world adoption has been stymied by the challenge of evaluating the clinical quality of AI-generated reports. In this study, we build a state-of-the-art report generation system for chest radiographs, Flamingo-CXR, by fine-tuning a well-known vision-language foundation model on radiology data. To evaluate the quality of the AI-generated reports, a group of 16 certified radiologists provide detailed evaluations of AI-generated and…
Taxonomy
Topics: Topic Modeling · Biomedical Text Mining and Ontologies · Artificial Intelligence in Healthcare and Education
