Does medical specialization of VLMs enhance discriminative power?: A comprehensive investigation through feature distribution analysis
Keita Takeda, Tomoya Sakai

TL;DR
This paper analyzes the feature representations of medical vision-language models (VLMs), revealing that non-medical models with recent contextual enhancements can produce more refined, discriminative features, emphasizing the importance of text encoder quality over medical image training.
Contribution
It provides a comprehensive analysis of feature distributions in medical VLMs, comparing medical and non-medical models, and highlights the significance of text encoder improvements for discriminative power.
Findings
Medical VLMs can extract effective discriminative features for classification.
Recent non-medical VLMs with contextual enrichment outperform medical VLMs in feature refinement.
Models are vulnerable to biases from overlaid textual information in images.
Abstract
This study investigates the feature representations produced by publicly available open-source medical vision-language models (VLMs). While medical VLMs are expected to capture diagnostically relevant features, their learned representations remain underexplored, and standard evaluations such as classification accuracy do not fully reveal whether they acquire truly discriminative, lesion-specific features. Understanding these representations is crucial for revealing medical image structures and improving downstream tasks in medical image analysis. This study aims to investigate the feature distributions learned by medical VLMs and evaluate the impact of medical specialization. We analyze the feature distributions extracted by several representative medical VLMs from lesion classification datasets spanning multiple image modalities. These distributions were compared with…
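As an illustration of what comparing feature distributions can look like in practice, the sketch below scores how well class labels separate a set of embeddings using a simple between-class to within-class scatter ratio. This is not the authors' method, which the truncated abstract does not specify; the metric, the function name `separability_ratio`, and the synthetic features standing in for embeddings from two hypothetical encoders are all assumptions made for the example.

```python
import numpy as np


def separability_ratio(features, labels):
    """Between-class scatter divided by within-class scatter (higher = more separable).

    Hypothetical helper for illustration; not taken from the paper.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    overall_mean = features.mean(axis=0)
    between = 0.0
    within = 0.0
    for c in np.unique(labels):
        class_feats = features[labels == c]
        class_mean = class_feats.mean(axis=0)
        # Spread of the class mean around the global mean, weighted by class size
        between += len(class_feats) * np.sum((class_mean - overall_mean) ** 2)
        # Spread of the samples around their own class mean
        within += np.sum((class_feats - class_mean) ** 2)
    return between / within


# Synthetic stand-ins for image embeddings (assumption: real features would
# come from each model's image encoder applied to a labeled lesion dataset).
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 200)
offsets = labels[:, None]
feats_a = rng.normal(size=(400, 16)) + 3.0 * offsets  # well-separated classes
feats_b = rng.normal(size=(400, 16)) + 0.3 * offsets  # heavily overlapping classes

score_a = separability_ratio(feats_a, labels)
score_b = separability_ratio(feats_b, labels)
print(f"encoder A: {score_a:.3f}  encoder B: {score_b:.3f}")
```

A score like this depends only on the geometry of the features, so it can differ between two encoders that reach similar classification accuracy, which is the kind of gap the abstract says accuracy alone does not reveal.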
Taxonomy
Topics: Multimodal Machine Learning Applications · Artificial Intelligence in Healthcare and Education · COVID-19 diagnosis using AI
