Retrieval-augmented in-context learning for multimodal large language models in disease classification
Zaifu Zhan, Shuang Zhou, Xiaoshan Zhou, Yongkang Xiao, Jun Wang,, Jiawen Deng, He Zhu, Yu Hou, Rui Zhang

TL;DR
This paper introduces RAICL, a retrieval-augmented in-context learning framework that dynamically selects similar disease demonstrations to improve multimodal large language model performance in disease classification tasks.
Contribution
The paper presents a novel RAICL framework combining retrieval-augmented generation and in-context learning for adaptive demonstration selection in multimodal disease classification.
Findings
RAICL improved classification accuracy on TCGA and IU Chest X-ray datasets.
Multi-modal inputs outperform single-modal inputs, with text-only inputs being more effective than images alone.
Increasing the number of retrieved examples enhances model performance.
Abstract
Objectives: We aim to dynamically retrieve informative demonstrations, enhancing in-context learning in multimodal large language models (MLLMs) for disease classification. Methods: We propose a Retrieval-Augmented In-Context Learning (RAICL) framework, which integrates retrieval-augmented generation (RAG) and in-context learning (ICL) to adaptively select demonstrations with similar disease patterns, enabling more effective ICL in MLLMs. Specifically, RAICL examines embeddings from diverse encoders, including ResNet, BERT, BioBERT, and ClinicalBERT, to retrieve appropriate demonstrations, and constructs conversational prompts optimized for ICL. We evaluated the framework on two real-world multi-modal datasets (TCGA and IU Chest X-ray), assessing its performance across multiple MLLMs (Qwen, Llava, Gemma), embedding strategies, similarity metrics, and varying numbers of demonstrations.…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsTopic Modeling · Artificial Intelligence in Healthcare · Text and Document Classification Technologies
MethodsRefunds@Expedia|||How do I get a full refund from Expedia? · Attention Is All You Need · Average Pooling · Linear Warmup With Linear Decay · Dropout · Layer Normalization · Attention Dropout · Softmax · Residual Connection · WordPiece
