Trustworthy Medical Imaging with Large Language Models: A Study of Hallucinations Across Modalities

Anindya Bijoy Das; Shahnewaz Karim Sakib; Shibbir Ahmed

arXiv:2508.07031·eess.IV·August 12, 2025

Open Access

TL;DR

This paper investigates hallucinations in large language models applied to medical imaging, analyzing errors in image-to-text and text-to-image tasks to improve clinical reliability and safety.

Contribution

It systematically studies hallucination patterns in medical imaging LLMs, providing insights into their causes and implications for enhancing trustworthiness.

Findings

01. Common hallucination patterns identified across modalities

02. Errors include factual inconsistencies and anatomical inaccuracies

03. Implications for improving model reliability in clinical settings

Abstract

Large Language Models (LLMs) are increasingly applied to medical imaging tasks, including image interpretation and synthetic image generation. However, these models often produce hallucinations: confident but incorrect outputs that can mislead clinical decisions. This study examines hallucinations in two directions: image-to-text, where LLMs generate reports from X-ray, CT, or MRI scans, and text-to-image, where models create medical images from clinical prompts. We analyze errors such as factual inconsistencies and anatomical inaccuracies, evaluating outputs using expert-informed criteria across imaging modalities. Our findings reveal common patterns of hallucination in both interpretive and generative tasks, with implications for clinical reliability. We also discuss factors contributing to these failures, including model architecture and training data. By systematically…

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Artificial Intelligence in Healthcare and Education · COVID-19 diagnosis using AI