MoME: Mixture of Visual Language Medical Experts for Medical Imaging Segmentation

Arghavan Rezvani; Xiangyi Yan; Anthony T. Wu; Kun Han; Pooya Khosravi; Xiaohui Xie

arXiv:2510.26996 · cs.CV · November 3, 2025

TL;DR

MoME introduces a novel mixture of visual language experts architecture that leverages multi-scale visual features and textual embeddings to improve medical image segmentation across diverse datasets.

Contribution

This work adapts the Mixture of Experts paradigm from language models to medical vision-language tasks, integrating foundation models for enhanced segmentation performance.

Findings

01 Strong performance on a benchmark of 10 datasets comprising 3,410 CT scans

02 Competitive precision across multiple medical imaging benchmarks

03 Effective integration of vision-language models for medical segmentation

Abstract

In this study, we propose MoME, a Mixture of Visual Language Medical Experts, for Medical Image Segmentation. MoME adapts the successful Mixture of Experts (MoE) paradigm, widely used in Large Language Models (LLMs), for medical vision-language tasks. The architecture enables dynamic expert selection by effectively utilizing multi-scale visual features tailored to the intricacies of medical imagery, enriched with textual embeddings. This work explores a novel integration of vision-language models for this domain. Utilizing an assembly of 10 datasets, encompassing 3,410 CT scans, MoME demonstrates strong performance on a comprehensive medical imaging segmentation benchmark. Our approach explores the integration of foundation models for medical imaging, benefiting from the established efficacy of MoE in boosting model performance by incorporating textual information. Demonstrating…
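
The abstract describes dynamic expert selection driven by multi-scale visual features and textual embeddings, but this listing does not give the gating details. As a rough illustration only, the sketch below shows a generic top-k softmax gate of the kind commonly used in Mixture of Experts models; it is not the authors' implementation. All names (`moe_gate`, `mix_experts`, `w_gate`), the pooling step, the tensor shapes, and the choice of `top_k=2` are assumptions made for the example.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def moe_gate(visual_feats, text_emb, w_gate, top_k=2):
    """Return sparse gate weights over experts, shape (num_experts,).

    Hypothetical gate: pools each visual scale, concatenates the result
    with the text embedding, and keeps only the top_k experts.
    """
    # Global-average-pool each scale (C_i, H_i, W_i) -> (C_i,).
    pooled = [f.mean(axis=(1, 2)) for f in visual_feats]
    gate_in = np.concatenate(pooled + [text_emb])
    logits = gate_in @ w_gate                # (num_experts,)
    top = np.argsort(logits)[-top_k:]        # indices of the top_k logits
    weights = np.zeros_like(logits)
    weights[top] = softmax(logits[top])      # renormalise over selected experts
    return weights

def mix_experts(expert_outputs, weights):
    # expert_outputs: (num_experts, num_classes, H, W) -> (num_classes, H, W)
    return np.tensordot(weights, expert_outputs, axes=1)

# Toy inputs: two visual scales, one text embedding, four experts (all assumed).
rng = np.random.default_rng(0)
visual_feats = [rng.normal(size=(8, 16, 16)), rng.normal(size=(16, 8, 8))]
text_emb = rng.normal(size=32)
num_experts = 4
w_gate = rng.normal(size=(8 + 16 + 32, num_experts))

weights = moe_gate(visual_feats, text_emb, w_gate, top_k=2)
expert_outputs = rng.normal(size=(num_experts, 3, 16, 16))
mixed = mix_experts(expert_outputs, weights)
```

Here the gate weights are zero for all but two experts and sum to one, so the mixed segmentation logits are a convex combination of the selected experts' outputs. A real system would learn `w_gate` and the experts jointly; the random values above only exercise the shapes.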

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Artificial Intelligence in Healthcare and Education