In-context learning enables multimodal large language models to classify cancer pathology images
Dyke Ferber, Georg W\"olflein, Isabella C. Wiest, Marta Ligero,, Srividhya Sainath, Narmin Ghaffari Laleh, Omar S.M. El Nahhas, Gustav, M\"uller-Franzes, Dirk J\"ager, Daniel Truhn, Jakob Nikolas Kather

TL;DR
This paper demonstrates that large vision-language models like GPT-4V can effectively perform medical image classification tasks in histopathology through in-context learning, reducing the need for extensive labeled datasets and specialized training.
Contribution
It shows that GPT-4V can be used out-of-the-box for cancer histopathology classification tasks via in-context learning, outperforming some specialized models with minimal samples.
Findings
In-context learning matches or surpasses specialized neural networks.
GPT-4V requires few samples for effective classification.
Medical image analysis can leverage generalist AI models.
Abstract
Medical image classification requires labeled, task-specific datasets which are used to train deep learning networks de novo, or to fine-tune foundation models. However, this process is computationally and technically demanding. In language processing, in-context learning provides an alternative, where models learn from within prompts, bypassing the need for parameter updates. Yet, in-context learning remains underexplored in medical image analysis. Here, we systematically evaluate the model Generative Pretrained Transformer 4 with Vision capabilities (GPT-4V) on cancer image processing with in-context learning on three cancer histopathology tasks of high importance: Classification of tissue subtypes in colorectal cancer, colon polyp subtyping and breast tumor detection in lymph node sections. Our results show that in-context learning is sufficient to match or even outperform…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsTopic Modeling · Biomedical Text Mining and Ontologies · AI in cancer detection
MethodsAttention Is All You Need · Linear Layer · Multi-Head Attention · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Layer Normalization · Absolute Position Encodings · Dropout · Softmax · Residual Connection
