Capabilities of Gemini Models in Medicine
Khaled Saab, Tao Tu, Wei-Hung Weng, Ryutaro Tanno, David Stutz, Ellery Wulczyn, Fan Zhang, Tim Strother, Chunjong Park, Elahe Vedadi, Juanma Zambrano Chaves, Szu-Yeu Hu, Mike Schaekermann, Aishwarya Kamath, Yong Cheng, David G.T. Barrett, Cathy Cheung, Basil Mustafa

TL;DR
Med-Gemini, a family of multimodal AI models specialized for medicine, is evaluated on 14 medical benchmarks and sets a new state of the art on 10 of them, surpassing GPT-4 and human experts in various medical tasks and demonstrating strong reasoning and multimodal capabilities.
Contribution
Introduction of Med-Gemini, a novel multimodal medical AI model with web search and custom encoder capabilities, achieving new state-of-the-art performance on multiple benchmarks.
Findings
Achieves 91.1% accuracy on the MedQA (USMLE) benchmark using uncertainty-guided search.
Surpasses GPT-4V on 7 multimodal benchmarks by an average relative margin of 44.5%.
Outperforms prior methods in long-context medical reasoning tasks.
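As a reading aid for the relative-margin figure above, here is a minimal sketch of how such a number is typically computed. It assumes the margin is the per-benchmark relative improvement over the baseline, averaged across benchmarks; the function names and the example scores are hypothetical and are not taken from the paper.

```python
from typing import Iterable, Tuple


def relative_margin(new_score: float, baseline_score: float) -> float:
    """Relative improvement of new_score over baseline_score, in percent."""
    if baseline_score == 0:
        raise ValueError("baseline_score must be non-zero")
    return (new_score - baseline_score) / baseline_score * 100.0


def average_relative_margin(pairs: Iterable[Tuple[float, float]]) -> float:
    """Mean of per-benchmark relative margins.

    Each pair is (new_score, baseline_score) for one benchmark.
    """
    margins = [relative_margin(new, base) for new, base in pairs]
    if not margins:
        raise ValueError("at least one benchmark pair is required")
    return sum(margins) / len(margins)


# Made-up scores for illustration only (NOT results from the paper).
example_pairs = [(60.0, 40.0), (75.0, 50.0)]
example_average = average_relative_margin(example_pairs)
```

Note that a relative margin differs from an absolute one: in the made-up first pair, the absolute gap is 20 points while the relative margin is 50%.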
Abstract
Excellence in a wide variety of medical applications poses considerable challenges for AI, requiring advanced reasoning, access to up-to-date medical knowledge and understanding of complex multimodal data. Gemini models, with strong general capabilities in multimodal and long-context reasoning, offer exciting possibilities in medicine. Building on these core strengths of Gemini, we introduce Med-Gemini, a family of highly capable multimodal models that are specialized in medicine with the ability to seamlessly use web search, and that can be efficiently tailored to novel modalities using custom encoders. We evaluate Med-Gemini on 14 medical benchmarks, establishing new state-of-the-art (SoTA) performance on 10 of them, and surpass the GPT-4 model family on every benchmark where a direct comparison is viable, often by a wide margin. On the popular MedQA (USMLE) benchmark, our…
Taxonomy
Topics: Simulation Techniques and Applications · Mathematical Biology Tumor Growth
Methods: Attention Is All You Need · Dropout · Softmax · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Absolute Position Encodings · Linear Layer · Dense Connections · Label Smoothing · Residual Connection
