A Vision-Language Foundation Model to Enhance Efficiency of Chest X-ray Interpretation
Zhihong Chen, Maya Varma, Justin Xu, Magdalini Paschali, Dave Van Veen, Andrew Johnston, Alaa Youssef, Louis Blankemeier, Christian Bluethgen, Stephan Altmayer, Jeya Maria Jose Valanarasu, Mohamed Siddig Eltayeb Muneer, Eduardo Pontes Reis, Joseph Paul Cohen, Cameron Olsen

TL;DR
This paper introduces CheXagent, a vision-language foundation model for chest X-ray (CXR) interpretation trained on the large-scale CheXinstruct dataset. The model performs competitively across eight task types, and in a clinical assessment with eight radiologists, drafting reports with CheXagent saved residents 36% of their time.
Contribution
The paper presents a new large-scale dataset (CheXinstruct), a foundation model trained on it (CheXagent), and an evaluation benchmark (CheXbench). The model achieves competitive performance across multiple tasks and improves clinical workflow efficiency.
Findings
36% time saving for residents using CheXagent
Report-writing efficiency improved in 81% of cases for residents and 61% of cases for attendings
Competitive performance across eight CXR interpretation task types
Abstract
Over 1.4 billion chest X-rays (CXRs) are performed annually due to their cost-effectiveness as an initial diagnostic test. This scale of radiological studies provides a significant opportunity to streamline CXR interpretation and documentation. While foundation models are a promising solution, the lack of publicly available large-scale datasets and benchmarks inhibits their iterative development and real-world evaluation. To overcome these challenges, we constructed a large-scale dataset (CheXinstruct), which we utilized to train a vision-language foundation model (CheXagent). We systematically demonstrated competitive performance across eight distinct task types on our novel evaluation benchmark (CheXbench). Beyond technical validation, we assessed the real-world utility of CheXagent in directly drafting radiology reports. Our clinical assessment with eight radiologists revealed a 36%…
Code & Models
- 🤗 StanfordAIMI/CheXagent-8b · model · 1.4k dl · ♡ 46
- 🤗 StanfordAIMI/CheXagent-2-3b · model · 4.9k dl · ♡ 10
- 🤗 StanfordAIMI/XraySigLIP__vit-b-16__laion2b-s34b-b88k · model · 3 dl
- 🤗 StanfordAIMI/XrayCLIP__vit-b-16__laion2b-s34b-b88k · model · 19 dl · ♡ 4
- 🤗 StanfordAIMI/XraySigLIP__vit-l-14__laion2b-s32b-b82k · model · 6 dl
- 🤗 StanfordAIMI/XrayCLIP__vit-l-14__laion2b-s32b-b82k · model · 195 dl · ♡ 2
- 🤗 StanfordAIMI/XraySigLIP__vit-b-16-siglip-512__webli · model · 1.0k dl · ♡ 1
- 🤗 StanfordAIMI/XrayCLIP__vit-b-16-siglip-512__webli · model · 30 dl
- 🤗 StanfordAIMI/XrayCLIP__vit-l-16-siglip-384__webli · model · 9 dl
- 🤗 StanfordAIMI/XraySigLIP__vit-l-16-siglip-384__webli · model · 3.3k dl · ♡ 1
Taxonomy
Topics: COVID-19 diagnosis using AI · Radiomics and Machine Learning in Medical Imaging · Lung Cancer Diagnosis and Treatment
