vMFCoOp: Towards Equilibrium on a Unified Hyperspherical Manifold for Prompting Biomedical VLMs
Minye Shao, Sihan Guo, Xinrun Li, Xingyu Miao, Haoran Duan, Yang Long

TL;DR
vMFCoOp introduces a hyperspherical manifold approach to improve semantic alignment in biomedical vision-language models, enhancing prompt robustness and few-shot learning across diverse medical datasets.
Contribution
It proposes a novel von Mises-Fisher (vMF) distribution-based framework on a hyperspherical manifold for better semantic alignment in biomedical VLM prompting.
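For readers unfamiliar with the vMF distribution, the following is a minimal illustrative sketch of how a vMF mean direction and concentration can be estimated from a set of embeddings projected onto the unit hypersphere. It is not the paper's method: the function name `fit_vmf` is hypothetical, and the concentration uses the widely used closed-form approximation of Banerjee et al. rather than anything stated in this summary.

```python
import numpy as np

def fit_vmf(embeddings):
    """Estimate a vMF mean direction and concentration (hypothetical helper).

    Assumes `embeddings` is an (n, d) array of non-zero vectors that are
    not all identical after normalization (otherwise kappa diverges).
    """
    x = np.asarray(embeddings, dtype=float)
    # Project every vector onto the unit hypersphere.
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    d = x.shape[1]
    # The mean resultant vector: its direction gives mu, its length
    # r_bar in [0, 1) measures how tightly the points cluster.
    mean_vec = x.mean(axis=0)
    r_bar = np.linalg.norm(mean_vec)
    mu = mean_vec / r_bar
    # Closed-form approximation to the maximum-likelihood concentration.
    kappa = r_bar * (d - r_bar**2) / (1.0 - r_bar**2)
    return mu, kappa
```

A larger `kappa` indicates embeddings concentrated more tightly around `mu`; a value near zero indicates embeddings spread almost uniformly over the sphere.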
Findings
Outperforms state-of-the-art methods in accuracy and generalization
Demonstrates robustness across 14 medical datasets and 12 imaging modalities
Enhances few-shot classification and clinical applicability
Abstract
Recent advances in context optimization (CoOp) guided by large language model (LLM)-distilled medical semantic priors offer a scalable alternative to manual prompt engineering and full fine-tuning for adapting biomedical CLIP-based vision-language models (VLMs). However, prompt learning in this context is challenged by semantic misalignment between LLMs and CLIP variants due to divergent training corpora and model architectures; it further lacks scalability across continuously evolving families of foundation models. More critically, pairwise multimodal alignment via conventional Euclidean-space optimization lacks the capacity to model unified representations or apply localized geometric constraints, which tends to amplify modality gaps in complex biomedical imaging and destabilize few-shot adaptation. In this work, we propose vMFCoOp, a framework that inversely estimates von…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Artificial Intelligence in Healthcare and Education · Machine Learning in Healthcare
