Med-Flamingo: a Multimodal Medical Few-shot Learner
Michael Moor, Qian Huang, Shirley Wu, Michihiro Yasunaga, Cyril Zakka, Yash Dalmia, Eduardo Pontes Reis, Pranav Rajpurkar, Jure Leskovec

TL;DR
Med-Flamingo is a multimodal medical few-shot learner built on OpenFlamingo-9B. It performs generative medical visual question answering from only a few in-context examples, is evaluated through human (clinician) assessment, and outperforms previous models.
Contribution
The paper introduces Med-Flamingo, a novel medical multimodal few-shot learning model that adapts pre-trained vision-language models to the medical domain with minimal data.
Findings
Up to 20% improvement in clinician-rated performance.
First human evaluation of generative medical VQA.
Enables multimodal medical few-shot tasks like rationale generation.
Abstract
Medicine, by its nature, is a multifaceted domain that requires the synthesis of information across various modalities. Medical generative vision-language models (VLMs) make a first step in this direction and promise many exciting clinical applications. However, existing models typically have to be fine-tuned on sizeable down-stream datasets, which poses a significant limitation as in many medical applications data is scarce, necessitating models that are capable of learning from few examples in real-time. Here we propose Med-Flamingo, a multimodal few-shot learner adapted to the medical domain. Based on OpenFlamingo-9B, we continue pre-training on paired and interleaved medical image-text data from publications and textbooks. Med-Flamingo unlocks few-shot generative medical visual question answering (VQA) abilities, which we evaluate on several datasets including a novel challenging…
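To make the few-shot setting described in the abstract concrete, the sketch below shows one way an interleaved image-text prompt for generative VQA could be assembled: a handful of solved (image, question, answer) examples followed by the unanswered query. The `<image>` and `<|endofchunk|>` markers follow the convention used by OpenFlamingo-style models, but their use here is an assumption, as are the `Shot` class, the `build_few_shot_prompt` helper, and the "Question: ... Answer: ..." template. None of these names come from the paper. The sketch only builds the text side of the prompt; the images themselves would be passed to the model separately, in the same order as the `<image>` markers.

```python
from dataclasses import dataclass
from typing import List

# Assumed special tokens, modeled on the OpenFlamingo-style convention.
IMAGE_TOKEN = "<image>"          # marks where an image is attached
END_OF_CHUNK = "<|endofchunk|>"  # closes one completed example


@dataclass
class Shot:
    """One solved in-context example (hypothetical container)."""
    question: str
    answer: str


def build_few_shot_prompt(shots: List[Shot], query_question: str,
                          instruction: str = "") -> str:
    """Interleave solved examples with the open query.

    Each shot contributes one image marker, so the caller must supply
    len(shots) + 1 images in matching order.
    """
    parts = [instruction] if instruction else []
    for shot in shots:
        parts.append(
            f"{IMAGE_TOKEN}Question: {shot.question} "
            f"Answer: {shot.answer}{END_OF_CHUNK}"
        )
    # The query ends at "Answer:" so the model generates the completion.
    parts.append(f"{IMAGE_TOKEN}Question: {query_question} Answer:")
    return "".join(parts)


if __name__ == "__main__":
    # Placeholder questions and answers, for illustration only.
    shots = [
        Shot("What imaging modality is shown?", "Chest X-ray."),
        Shot("Which organ is depicted?", "The liver."),
    ]
    print(build_few_shot_prompt(shots, "What is the most likely finding?"))
```

Ending the prompt at "Answer:" is what makes the task generative: the model writes a free-text answer rather than picking from a fixed label set, which is why the paper relies on clinician ratings for evaluation.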
Taxonomy
Topics: Multimodal Machine Learning Applications · Digital Storytelling and Education
