Retrieval-augmented in-context learning for multimodal large language   models in disease classification

Zaifu Zhan; Shuang Zhou; Xiaoshan Zhou; Yongkang Xiao; Jun Wang,; Jiawen Deng; He Zhu; Yu Hou; Rui Zhang

arXiv:2505.02087·cs.AI·May 6, 2025

Retrieval-augmented in-context learning for multimodal large language models in disease classification

Zaifu Zhan, Shuang Zhou, Xiaoshan Zhou, Yongkang Xiao, Jun Wang,, Jiawen Deng, He Zhu, Yu Hou, Rui Zhang

PDF

Open Access 1 Models

TL;DR

This paper introduces RAICL, a retrieval-augmented in-context learning framework that dynamically selects similar disease demonstrations to improve multimodal large language model performance in disease classification tasks.

Contribution

The paper presents a novel RAICL framework combining retrieval-augmented generation and in-context learning for adaptive demonstration selection in multimodal disease classification.

Findings

01

RAICL improved classification accuracy on TCGA and IU Chest X-ray datasets.

02

Multi-modal inputs outperform single-modal inputs, with text-only inputs being more effective than images alone.

03

Increasing the number of retrieved examples enhances model performance.

Abstract

Objectives: We aim to dynamically retrieve informative demonstrations, enhancing in-context learning in multimodal large language models (MLLMs) for disease classification. Methods: We propose a Retrieval-Augmented In-Context Learning (RAICL) framework, which integrates retrieval-augmented generation (RAG) and in-context learning (ICL) to adaptively select demonstrations with similar disease patterns, enabling more effective ICL in MLLMs. Specifically, RAICL examines embeddings from diverse encoders, including ResNet, BERT, BioBERT, and ClinicalBERT, to retrieve appropriate demonstrations, and constructs conversational prompts optimized for ICL. We evaluated the framework on two real-world multi-modal datasets (TCGA and IU Chest X-ray), assessing its performance across multiple MLLMs (Qwen, Llava, Gemma), embedding strategies, similarity metrics, and varying numbers of demonstrations.…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Code & Models

Models

🤗
Learn4everrr/Tuned_bioBERT
model· 1 dl
1 dl

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsTopic Modeling · Artificial Intelligence in Healthcare · Text and Document Classification Technologies

MethodsRefunds@Expedia|||How do I get a full refund from Expedia? · Attention Is All You Need · Average Pooling · Linear Warmup With Linear Decay · Dropout · Layer Normalization · Attention Dropout · Softmax · Residual Connection · WordPiece