MedFocusCLIP: Improving few shot classification in medical datasets using pixel wise attention

Aadya Arora; Vinay Namboodiri

arXiv:2501.03839 · eess.IV · January 8, 2025

TL;DR

This paper introduces MedFocusCLIP, a method that enhances few-shot medical image classification by guiding CLIP's attention to relevant regions using segmentation cues from SAM2, improving accuracy and interpretability.

Contribution

It proposes leveraging SAM2 segmentation as a visual prompt to improve CLIP's fine-grained medical image classification in few-shot settings.

Findings

01. Achieved higher accuracy on multiple medical datasets compared to baseline CLIP.

02. Enabled interpretable localization of discriminative regions in images.

03. Demonstrated effectiveness across X-ray, CT, and MRI datasets.

Abstract

With the popularity of foundation models, parameter-efficient fine-tuning has become the de facto approach for adapting pretrained models to downstream tasks. Taking inspiration from recent advances in large language models, Visual Prompt Tuning and similar techniques learn an additional prompt to efficiently fine-tune a pretrained vision foundation model. However, we observe that such prompting is insufficient for fine-grained visual classification tasks such as medical image classification, where inter-class variance is small and intra-class variance is large. Hence, in this paper we propose to leverage the advanced segmentation capabilities of Segment Anything Model 2 (SAM2) as a visual prompting cue for the visual encoder in CLIP (Contrastive Language-Image Pretraining), guiding the encoder's attention to relevant regions in the image. This helps the…
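The abstract is truncated here and does not spell out how the segmentation cue is combined with the image, so the following is only a minimal sketch of one plausible reading: a pixel-wise weighting of the image by a segmentation mask, followed by CLIP-style classification via softmax over cosine similarities. The function names (`apply_pixelwise_mask`, `classify`), the `background_weight` and `temperature` parameters, and their default values are assumptions made for illustration, not the authors' implementation. In a real pipeline the mask would come from SAM2 and the embeddings from CLIP's encoders; here both are plain NumPy arrays.

```python
import numpy as np


def apply_pixelwise_mask(image, mask, background_weight=0.2):
    """Dim pixels outside the mask and keep pixels inside it unchanged.

    image: (H, W, C) float array with values in [0, 1].
    mask:  (H, W) float array in [0, 1], 1 marking the region of interest.
    background_weight is a hypothetical knob, not a value from the paper.
    """
    image = np.asarray(image, dtype=np.float32)
    mask = np.asarray(mask, dtype=np.float32)
    if image.shape[:2] != mask.shape:
        raise ValueError("mask must match the image height and width")
    # Per-pixel weight: 1.0 inside the mask, background_weight outside it.
    weights = background_weight + (1.0 - background_weight) * mask
    return image * weights[..., None]


def softmax(logits):
    """Numerically stable softmax over a 1-D array."""
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()


def classify(image_embedding, text_embeddings, temperature=0.01):
    """CLIP-style scoring: softmax over scaled cosine similarities.

    image_embedding: (D,) vector for one image.
    text_embeddings: (K, D) matrix, one row per class prompt.
    """
    img = image_embedding / np.linalg.norm(image_embedding)
    txt = text_embeddings / np.linalg.norm(text_embeddings, axis=1, keepdims=True)
    return softmax(txt @ img / temperature)
```

With a mask that is 1 over the region of interest, the masked image keeps that region at full intensity and scales everything else by `background_weight`; the class probabilities returned by `classify` sum to 1 and peak at the class prompt whose embedding is most aligned with the image embedding.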

Taxonomy

Topics: COVID-19 diagnosis using AI · Artificial Intelligence in Healthcare · Radiomics and Machine Learning in Medical Imaging

Methods: Softmax · Attention Is All You Need · Contrastive Language-Image Pre-training · Focus