Few-shot Adaptation of Medical Vision-Language Models

Fereshteh Shakeri; Yunshi Huang; Julio Silva-Rodríguez; Houda Bahig; An Tang; Jose Dolz; Ismail Ben Ayed

arXiv:2409.03868·cs.CV·September 9, 2024

TL;DR

This paper introduces the first structured benchmark for few-shot adaptation of medical vision-language models. It evaluates both simple and complex adaptation strategies across three medical modalities and nine tasks, and finds that a linear-probe baseline performs competitively.

Contribution

The paper presents a new benchmark for few-shot adaptation of medical VLMs and demonstrates that a simple linear-probe method can be both highly effective and efficient.

Findings

01. The linear-probe baseline performs competitively with more complex methods.

02. The benchmark covers three medical modalities and nine tasks.

03. The code and benchmark are publicly available for further research.

Abstract

Integrating image and text data through multi-modal learning has emerged as a new approach in medical imaging research, following its successful deployment in computer vision. While considerable efforts have been dedicated to establishing medical foundation models and their zero-shot transfer to downstream tasks, the popular few-shot setting remains relatively unexplored. Following on from the currently strong emergence of this setting in computer vision, we introduce the first structured benchmark for adapting medical vision-language models (VLMs) in a strict few-shot regime and investigate various adaptation strategies commonly used in the context of natural images. Furthermore, we evaluate a simple generalization of the linear-probe adaptation baseline, which seeks an optimal blending of the visual prototypes and text embeddings via learnable class-wise multipliers. Surprisingly,…
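
The blending of visual prototypes and text embeddings mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the paper or its repository: the function names, the specific form w_k = prototype_k + alpha_k * text_k, the logit scale of 100, and the random stand-in embeddings are all assumptions made for illustration. In an actual adaptation run the class-wise multipliers alpha would be learned on the few-shot support set; here they are simply fixed to show the forward pass.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit L2 norm along the given axis."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def visual_prototypes(features, labels, num_classes):
    """Per-class mean of the support features, re-normalized to unit norm."""
    protos = np.stack(
        [features[labels == k].mean(axis=0) for k in range(num_classes)]
    )
    return l2_normalize(protos)

def blended_logits(features, prototypes, text_embeddings, alpha, scale=100.0):
    """Logits from class weights w_k = prototype_k + alpha_k * text_k.

    alpha holds one multiplier per class (assumed form of the blending).
    """
    weights = prototypes + alpha[:, None] * text_embeddings
    return scale * features @ weights.T

# Stand-in data: 3 classes, 4 shots per class, 8-dimensional embeddings.
rng = np.random.default_rng(0)
num_classes, shots, dim = 3, 4, 8
labels = np.repeat(np.arange(num_classes), shots)
features = l2_normalize(rng.normal(size=(num_classes * shots, dim)))
text = l2_normalize(rng.normal(size=(num_classes, dim)))

protos = visual_prototypes(features, labels, num_classes)
alpha = np.ones(num_classes)  # hypothetical initial value; would be learned
logits = blended_logits(features, protos, text, alpha)
predictions = logits.argmax(axis=1)
```

Setting every alpha_k to zero reduces this sketch to a classifier built only from the visual prototypes, while large alpha_k values let the text embedding of class k dominate, which is the sense in which the multipliers control the blend.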

Code & Models

Repositories

fereshteshakeri/few-shot-medvlms
PyTorch · Official

Taxonomy

Topics: Multimodal Machine Learning Applications