Utility of Multimodal Large Language Models in Analyzing Chest X-ray with Incomplete Contextual Information
Choonghan Kim, Seonhee Cho, Joo Heung Yoon

TL;DR
This study evaluates whether multimodal large language models, which take both text and images as input, improve the accuracy and robustness of chest X-ray report analysis, especially when the textual information is incomplete.
Contribution
The study demonstrates that multimodal LLMs significantly outperform text-only models when analyzing chest radiograph reports with incomplete text.
Findings
Multimodal models outperform text-only models in incomplete data scenarios.
Adding images boosts model accuracy, even with limited textual information.
OpenFlamingo performs best with complete text, but multimodal models excel with missing data.
Abstract
Background: Large language models (LLMs) are increasingly used in clinical settings, but their performance can suffer when radiology reports are incomplete. We tested whether multimodal LLMs (using text and images) could improve accuracy and understanding in chest radiography reports, making them more effective for clinical decision support.
Purpose: To assess the robustness of LLMs in generating accurate impressions from chest radiography reports using both incomplete data and multimodal data.
Materials and Methods: We used 300 radiology image-report pairs from the MIMIC-CXR database. Three LLMs (OpenFlamingo, MedFlamingo, IDEFICS) were tested in both text-only and multimodal formats. Impressions were first generated from the full text, then generated again after removing 20%, 50%, and 80% of the text. The impact of adding images was evaluated using chest X-rays, and model performance was compared using…
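The text-removal step described in the methods can be illustrated with a minimal sketch. The abstract does not say how the text was removed, so the code below assumes random removal at the level of whitespace-delimited words; the function name `drop_words`, the `seed` parameter, and the sample report sentence are all hypothetical and not taken from the paper.

```python
import random


def drop_words(report: str, fraction: float, seed: int = 0) -> str:
    """Remove about `fraction` of the words in `report`, keeping word order.

    Assumption: removal is random and word-level. The study may have used
    a different unit (characters, sentences) or a different selection rule.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    words = report.split()
    n_drop = round(len(words) * fraction)
    rng = random.Random(seed)  # fixed seed so the ablation is repeatable
    dropped = set(rng.sample(range(len(words)), n_drop))
    return " ".join(w for i, w in enumerate(words) if i not in dropped)


# Hypothetical ten-word report fragment, used only for illustration.
report = (
    "Lungs are clear without focal consolidation "
    "pleural effusion or pneumothorax"
)

# The three ablation levels named in the abstract, plus the full text.
for fraction in (0.0, 0.2, 0.5, 0.8):
    print(f"{int(fraction * 100)}% removed: {drop_words(report, fraction)}")
```

Fixing the seed means the text-only and multimodal conditions would see the same degraded report, so any difference in output could be attributed to the added image rather than to which words happened to be removed.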
Taxonomy
Topics: Radiomics and Machine Learning in Medical Imaging · Topic Modeling · Computational and Text Analysis Methods
