Uncertainty-Driven Expert Control: Enhancing the Reliability of Medical Vision-Language Models
Xiao Liang, Di Wang, Zhicheng Jiao, Ronghan Li, Pengfei Yang, Quan Wang, Tat-Seng Chua

TL;DR
This paper introduces an expert-in-the-loop framework called Expert-CFG that improves medical vision-language models by estimating uncertainty, retrieving relevant references, and guiding outputs without additional training, enhancing reliability in clinical applications.
Contribution
The proposed Expert-CFG framework aligns MedVLM with clinical expertise using uncertainty estimation and reference retrieval, avoiding costly retraining and improving model reliability.
Findings
Outperforms state-of-the-art models while using fewer parameters.
Effective in resource-limited clinical settings.
Demonstrates improved accuracy on medical visual question answering benchmarks.
Abstract
The rapid advancements in Vision Language Models (VLMs) have prompted the development of multi-modal medical assistant systems. Despite this progress, current models still have inherent probabilistic uncertainties and often produce erroneous or unverified responses, an issue with serious implications in medical applications. Existing methods aim to enhance the performance of Medical Vision Language Models (MedVLMs) by adjusting the model structure, fine-tuning on high-quality data, or applying preference fine-tuning. However, these training-dependent strategies are costly and still lack sufficient alignment with clinical expertise. To address these issues, we propose an expert-in-the-loop framework named Expert-Controlled Classifier-Free Guidance (Expert-CFG) to align MedVLMs with clinical expertise without additional training. This framework introduces an uncertainty estimation strategy to…
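The abstract is cut off before it describes the method, so the following is not the authors' algorithm. It is only a minimal sketch of the general idea the name suggests: estimate uncertainty at a decoding step and, when it is high, apply classifier-free guidance toward logits conditioned on expert-provided context. The entropy-based uncertainty measure, the threshold, the guidance scale, and every function name here are assumptions made for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats; used here as an assumed uncertainty score."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def cfg_logits(cond, uncond, scale):
    """Standard classifier-free guidance mix: uncond + scale * (cond - uncond)."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]

def guided_step(plain_logits, expert_logits, threshold, scale):
    """Pick the next token, applying guidance only when the model is uncertain.

    plain_logits:  logits without expert context (hypothetical input)
    expert_logits: logits conditioned on expert context (hypothetical input)
    Returns (token_index, used_guidance).
    """
    uncertain = entropy(softmax(plain_logits)) > threshold
    logits = cfg_logits(expert_logits, plain_logits, scale) if uncertain else plain_logits
    token = max(range(len(logits)), key=logits.__getitem__)
    return token, uncertain

# Uniform logits are maximally uncertain, so guidance is applied.
print(guided_step([1.0, 1.0, 1.0], [0.0, 2.0, 0.0], threshold=0.5, scale=1.5))  # (1, True)
# A sharply peaked distribution is left untouched.
print(guided_step([5.0, 0.0, 0.0], [0.0, 2.0, 0.0], threshold=0.5, scale=1.5))  # (0, False)
```

Gating guidance on an uncertainty score means confident predictions pass through unchanged, which is one way to read the "without additional training" claim: only decoding is altered, not model weights.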
Taxonomy
Topics: Semantic Web and Ontologies · AI-based Problem Solving and Planning · Biomedical Text Mining and Ontologies
