MoME: Mixture of Visual Language Medical Experts for Medical Imaging Segmentation
Arghavan Rezvani, Xiangyi Yan, Anthony T. Wu, Kun Han, Pooya Khosravi, Xiaohui Xie

TL;DR
MoME introduces a novel mixture of visual language experts architecture that leverages multi-scale visual features and textual embeddings to improve medical image segmentation across diverse datasets.
Contribution
This work adapts the Mixture of Experts paradigm from language models to medical vision-language tasks, integrating foundation models for enhanced segmentation performance.
Findings
Strong performance on 10 datasets including 3,410 CT scans
Competitive precision across multiple medical imaging benchmarks
Effective integration of vision-language models for medical segmentation
Abstract
In this study, we propose MoME, a Mixture of Visual Language Medical Experts, for Medical Image Segmentation. MoME adapts the successful Mixture of Experts (MoE) paradigm, widely used in Large Language Models (LLMs), for medical vision-language tasks. The architecture enables dynamic expert selection by effectively utilizing multi-scale visual features tailored to the intricacies of medical imagery, enriched with textual embeddings. This work explores a novel integration of vision-language models for this domain. Utilizing an assembly of 10 datasets, encompassing 3,410 CT scans, MoME demonstrates strong performance on a comprehensive medical imaging segmentation benchmark. Our approach explores the integration of foundation models for medical imaging, benefiting from the established efficacy of MoE in boosting model performance by incorporating textual information. Demonstrating…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsMultimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Artificial Intelligence in Healthcare and Education
