Similarity-as-Evidence: Calibrating Overconfident VLMs for Interpretable and Label-Efficient Medical Active Learning

Zhuofan Xie; Zishan Lin; Jinliang Lin; Jie Qi; Shaohua Hong; Shuo Li

arXiv:2602.18867 · cs.CV · March 12, 2026

Open Access

TL;DR

This paper introduces SaE, a framework that calibrates similarity scores in vision-language models for medical active learning, improving interpretability and reducing overconfidence to enhance sample selection and annotation efficiency.

Contribution

SaE reinterprets similarity scores as evidence using a Dirichlet distribution, enabling better calibration and more effective, interpretable active learning in medical imaging.
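As a rough illustration of treating similarities as evidence, the sketch below maps a similarity vector to Dirichlet parameters using the standard evidential-learning recipe (non-negative evidence, alpha = evidence + 1). The function name `similarity_to_dirichlet`, the softplus activation, and the `scale` factor are assumptions chosen for illustration; the listing does not specify how the paper's Similarity Evidence Head is actually built.

```python
import math

def softplus(x):
    # Numerically stable softplus: log(1 + exp(x)), always positive.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def similarity_to_dirichlet(similarities, scale=10.0):
    """Hypothetical evidence head: similarities -> Dirichlet parameters."""
    # Non-negative evidence per class (softplus and scale are assumptions).
    evidence = [softplus(scale * s) for s in similarities]
    # Dirichlet concentration: one pseudo-count of prior plus the evidence.
    alpha = [e + 1.0 for e in evidence]
    strength = sum(alpha)
    # Expected class probabilities under the Dirichlet.
    probs = [a / strength for a in alpha]
    # Vacuity: near 1 when total evidence is small, shrinking as it grows.
    vacuity = len(alpha) / strength
    return alpha, probs, vacuity

# Weak, nearly flat similarities versus one clearly dominant similarity.
weak_alpha, weak_probs, weak_vacuity = similarity_to_dirichlet([0.02, 0.01, 0.00])
strong_alpha, strong_probs, strong_vacuity = similarity_to_dirichlet([0.90, 0.10, 0.05])
```

Under these assumptions, the weak input yields a higher vacuity than the strong one, so lack of evidence stays visible instead of being normalized away.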

Findings

01. Achieves state-of-the-art accuracy of 82.57% on ten datasets.

02. Improves calibration, reaching a negative log-likelihood (NLL) of 0.425 on BTMRI.

03. Effectively prioritizes rare and ambiguous cases during active learning.
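For readers unfamiliar with the calibration metric in the findings above, negative log-likelihood (NLL) is the mean of the negative log probability assigned to the true class. The sketch below computes it on made-up predictions. The numbers are purely illustrative and unrelated to the paper's results.

```python
import math

def negative_log_likelihood(probs, labels, eps=1e-12):
    """Mean negative log probability assigned to the true class."""
    total = 0.0
    for p, y in zip(probs, labels):
        # Clamp to eps so a zero probability does not produce log(0).
        total += -math.log(max(p[y], eps))
    return total / len(labels)

labels = [0, 1]
# Both predictors pick class 0 twice, so both are right once and wrong once.
overconfident = [[0.99, 0.01], [0.99, 0.01]]
hedged = [[0.70, 0.30], [0.70, 0.30]]

nll_overconfident = negative_log_likelihood(overconfident, labels)
nll_hedged = negative_log_likelihood(hedged, labels)
```

Both predictors have the same accuracy, but the overconfident one gets a much larger NLL because its single mistake was made with near-certainty.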

Abstract

Active Learning (AL) reduces annotation costs in medical imaging by selecting only the most informative samples for labeling, but suffers from cold-start when labeled data are scarce. Vision-Language Models (VLMs) address the cold-start problem via zero-shot predictions, yet their temperature-scaled softmax outputs treat text-image similarities as deterministic scores while ignoring inherent uncertainty, leading to overconfidence. This overconfidence misleads sample selection, wasting annotation budgets on uninformative cases. To overcome these limitations, the Similarity-as-Evidence (SaE) framework calibrates text-image similarities by introducing a Similarity Evidence Head (SEH), which reinterprets the similarity vector as evidence and parameterizes a Dirichlet distribution over labels. In contrast to a standard softmax that enforces confident predictions even under weak signals, the…
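The overconfidence of a temperature-scaled softmax can be seen in a small numeric sketch. The similarity values and the temperature of 0.01 are assumptions for illustration (small temperatures of this order are common in CLIP-style models); neither is taken from the paper.

```python
import math

def softmax_with_temperature(similarities, temperature):
    """Softmax over similarities divided by a temperature."""
    logits = [s / temperature for s in similarities]
    # Subtract the max logit for numerical stability.
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Three class similarities that differ only slightly (illustrative values).
sims = [0.26, 0.23, 0.21]

sharp = softmax_with_temperature(sims, temperature=0.01)
soft = softmax_with_temperature(sims, temperature=1.0)
```

With the small temperature, a 0.03 gap in similarity becomes a top-class probability of about 0.95, while the untempered softmax stays close to uniform. The weak signal is the same in both cases; only the reported confidence changes.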

Taxonomy

Topics: COVID-19 diagnosis using AI · Domain Adaptation and Few-Shot Learning · Machine Learning and Algorithms