Perceive and Calibrate: Analyzing and Enhancing Robustness of Medical Multi-Modal Large Language Models

Dunyuan XU; Xikai Yang; Yaoqian Li; Juzheng Miao; Jinpeng Li; Pheng-Ann Heng

arXiv:2512.21964·cs.CV·December 29, 2025

Open Access

TL;DR

This paper systematically analyzes the impact of various noise perturbations on medical multi-modal large language models and introduces a training-free calibration framework to enhance their robustness in clinical settings.

Contribution

The paper presents a training-free Inherent-enhanced Multi-modal Calibration (IMC) framework that improves the robustness of medical MLLMs against real-world noise in both the visual and textual modalities.

Findings

01. Achieves state-of-the-art robustness across multiple modalities.

02. Constructs a comprehensive benchmark with 11 noise types on 2 datasets.

03. Demonstrates the effectiveness of the perception-and-calibration approach in clinical scenarios.

Abstract

Medical Multi-modal Large Language Models (MLLMs) have shown promising clinical performance. However, their sensitivity to real-world input perturbations, such as imaging artifacts and textual errors, critically undermines their clinical applicability. A systematic analysis of how such noise affects medical MLLMs is still largely missing. Furthermore, while several works have investigated MLLM robustness in general domains, they primarily focus on the text modality and rely on costly fine-tuning. These approaches are inadequate for addressing the complex noise patterns and meeting the strict safety standards of medicine. To bridge this gap, this work systematically analyzes the impact of various perturbations on medical MLLMs across both visual and textual modalities. Building on our findings, we introduce a training-free Inherent-enhanced Multi-modal Calibration (IMC) framework that leverages MLLMs'…
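To make the two perturbation families named in the abstract concrete (imaging artifacts and textual errors), here is a minimal sketch of how such noise might be injected into an image-question pair before it reaches a model. It is an illustration under stated assumptions, not the paper's benchmark: the function names `add_gaussian_noise` and `swap_adjacent_chars`, the choice of Gaussian pixel noise and adjacent-character swaps, and the parameter values are all hypothetical, and the listing above does not specify which 11 noise types the authors use.

```python
import random


def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise to a grayscale image.

    `image` is a list of rows of floats in [0, 1]; results are clipped
    back into [0, 1]. (Hypothetical stand-in for an imaging artifact.)
    """
    return [
        [min(1.0, max(0.0, px + rng.gauss(0.0, sigma))) for px in row]
        for row in image
    ]


def swap_adjacent_chars(text, rate, rng):
    """Simulate typing errors by swapping adjacent letters.

    Each pair of neighbouring alphabetic characters is swapped with
    probability `rate`. (Hypothetical stand-in for a textual error.)
    """
    chars = list(text)
    i = 0
    while i < len(chars) - 1:
        if chars[i].isalpha() and chars[i + 1].isalpha() and rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip past the swapped pair so it is not swapped back
        else:
            i += 1
    return "".join(chars)


# Seeded generator so the perturbations are reproducible.
rng = random.Random(0)

# A tiny uniform "image" and an example question, both made up for illustration.
image = [[0.5] * 4 for _ in range(4)]
question = "Is there evidence of pneumonia in this chest X-ray?"

noisy_image = add_gaussian_noise(image, sigma=0.1, rng=rng)
noisy_question = swap_adjacent_chars(question, rate=0.2, rng=rng)
```

A robustness study of this kind would then compare a model's answers on `(image, question)` against its answers on `(noisy_image, noisy_question)`; how the model is called and scored is outside the scope of this sketch.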

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI