MI-VisionShot: Few-shot adaptation of vision-language models for slide-level classification of histopathological images
Pablo Meseguer, Rocío del Amor, Valery Naranjo

TL;DR
MI-VisionShot is a training-free, prototype-based method that enhances vision-language models for accurate, low-variability slide-level classification in histopathology with few-shot learning.
Contribution
It introduces a novel, training-free adaptation technique leveraging prototypes for slide-level classification in digital pathology.
Findings
Outperforms zero-shot transfer in low-shot scenarios
Reduces variability in predictions compared to existing methods
Effective in few-shot histopathological image classification
Abstract
Vision-language supervision has made remarkable strides in learning visual representations from textual guidance. In digital pathology, vision-language models (VLMs), pre-trained on curated datasets of histological image-caption pairs, have been adapted to downstream tasks, such as region-of-interest classification. Zero-shot transfer for slide-level prediction has been formulated by MI-Zero, but it exhibits high variability depending on the textual prompts. Inspired by prototypical learning, we propose MI-VisionShot, a training-free adaptation method on top of VLMs to predict slide-level labels in few-shot learning scenarios. Our framework takes advantage of the excellent representation learning of VLMs to create prototype-based classifiers under a multiple-instance setting by retrieving the most discriminative patches within each slide. Experimentation through different settings shows the…
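The abstract gives the idea only at a high level, so the following is a minimal sketch of one plausible reading, not the authors' implementation. It assumes patch embeddings have already been extracted with a frozen VLM image encoder, that "most discriminative" means the top-k patches by cosine similarity to a class reference vector, and that one reference vector per class (for example a text embedding) is available to rank patches in the support slides. The function names, the parameter `k`, and the `class_queries` input are all hypothetical.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def top_k_pool(patches, query, k):
    """Average the k patch embeddings most similar to `query` (assumed pooling rule).

    patches: (n_patches, dim) array for one slide; query: (dim,) vector.
    """
    patches = l2_normalize(patches)
    sims = patches @ l2_normalize(query)
    k = min(k, len(patches))
    top_idx = np.argsort(sims)[-k:]  # indices of the k highest similarities
    return l2_normalize(patches[top_idx].mean(axis=0))


def build_prototypes(support_slides, support_labels, class_queries, k):
    """One prototype per class: mean of the pooled embeddings of its few-shot slides.

    class_queries is a hypothetical (n_classes, dim) array used only to rank patches.
    """
    prototypes = []
    for c, query in enumerate(class_queries):
        pooled = [top_k_pool(s, query, k)
                  for s, y in zip(support_slides, support_labels) if y == c]
        if not pooled:
            raise ValueError(f"no support slide for class {c}")
        prototypes.append(l2_normalize(np.mean(pooled, axis=0)))
    return np.stack(prototypes)


def predict(slide, prototypes, k):
    """Score a slide against each prototype using the patches closest to that prototype."""
    scores = [float(top_k_pool(slide, p, k) @ p) for p in prototypes]
    return int(np.argmax(scores)), scores
```

No parameters are fitted anywhere in this sketch, which is what makes it training-free: the prototypes are plain averages of frozen embeddings, and prediction is a nearest-prototype lookup over pooled patch features.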
Taxonomy
Topics: AI in cancer detection · COVID-19 diagnosis using AI · Colorectal Cancer Screening and Detection
