Comparative analysis of privacy-preserving open-source LLMs regarding extraction of diagnostic information from clinical CMR imaging reports
Sina Amirrajab, Volker Vehof, Michael Bietenbeck, Ali Yilmaz

TL;DR
This study evaluates open-source, privacy-preserving LLMs for extracting diagnostic information from CMR reports, showing that they can outperform a board-certified cardiologist in classification accuracy, which supports their clinical utility.
Contribution
It provides a comprehensive comparison of nine open-source LLMs for clinical report classification, highlighting top models that outperform a board-certified cardiologist.
Findings
Top LLMs achieved F1 scores above 0.95.
Most models outperformed the cardiologist in classification accuracy.
Open-source LLMs are feasible for clinical report analysis.
Abstract
Purpose: We investigated the use of privacy-preserving, locally deployed, open-source Large Language Models (LLMs) to extract diagnostic information from free-text cardiovascular magnetic resonance (CMR) reports.
Materials and Methods: We evaluated nine open-source LLMs on their ability to identify diagnoses and classify patients into various cardiac diagnostic categories based on descriptive findings in 109 clinical CMR reports. Performance was quantified using standard classification metrics including accuracy, precision, recall, and F1 score. We also employed confusion matrices to examine patterns of misclassification across models.
Results: Most open-source LLMs demonstrated exceptional performance in classifying reports into different diagnostic categories. Google's Gemma2 model achieved the highest average F1 score of 0.98, followed by Qwen2.5:32B and DeepseekR1-32B with…
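The evaluation metrics named in the abstract (accuracy, precision, recall, F1 score, and a confusion matrix) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the category names and toy labels below are hypothetical, and macro-averaging is an assumption, since the abstract does not state how the "average F1 score" was aggregated across categories.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count (true label, predicted label) pairs: a sparse confusion matrix."""
    return Counter(zip(y_true, y_pred))


def accuracy(y_true, y_pred):
    """Fraction of reports whose predicted category matches the reference."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_metrics(y_true, y_pred):
    """Precision, recall, and F1 for each category (one-vs-rest)."""
    labels = sorted(set(y_true) | set(y_pred))
    cm = confusion_counts(y_true, y_pred)
    metrics = {}
    for label in labels:
        tp = cm[(label, label)]
        fp = sum(cm[(t, label)] for t in labels if t != label)
        fn = sum(cm[(label, p)] for p in labels if p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-category F1 (assumed averaging scheme)."""
    metrics = per_class_metrics(y_true, y_pred)
    return sum(m["f1"] for m in metrics.values()) / len(metrics)


# Hypothetical toy data: reference diagnoses vs. one model's predictions.
y_true = ["myocarditis", "myocarditis", "normal", "normal",
          "infarction", "infarction"]
y_pred = ["myocarditis", "normal", "normal", "normal",
          "infarction", "infarction"]

cm = confusion_counts(y_true, y_pred)
acc = accuracy(y_true, y_pred)
metrics = per_class_metrics(y_true, y_pred)
avg_f1 = macro_f1(y_true, y_pred)

print(f"accuracy = {acc:.3f}, macro F1 = {avg_f1:.3f}")
for (true_label, pred_label), count in sorted(cm.items()):
    print(f"{true_label} -> {pred_label}: {count}")
```

The off-diagonal entries of the confusion matrix (here, one "myocarditis" report predicted as "normal") are what reveal which categories a model tends to confuse, something a single aggregate score cannot show.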
Taxonomy
Topics: Medical Imaging and Analysis · Artificial Intelligence in Healthcare and Education
