Med-MoE: Mixture of Domain-Specific Experts for Lightweight Medical Vision-Language Models

Songtao Jiang; Tuo Zheng; Yan Zhang; Yeying Jin; Li Yuan; Zuozhu Liu

arXiv:2404.10237 · cs.CV · September 4, 2024 · 1 citation

TL;DR

Med-MoE introduces a lightweight, domain-specific mixture-of-experts framework for multimodal medical tasks, achieving high performance with fewer activated parameters, suitable for resource-constrained clinical settings.

Contribution

The paper presents a novel Med-MoE framework that efficiently handles both discriminative and generative medical tasks with fewer parameters, enhancing practicality in clinical applications.

Findings

01. Achieves superior or comparable performance to state-of-the-art models.

02. Requires only 30-50% of model parameters to be activated.

03. Demonstrates effectiveness across multiple medical datasets and tasks.
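
As a rough illustration of how a sparse mixture-of-experts model can activate only a fraction of its parameters, the sketch below computes that fraction from a shared (always-used) part and a set of equally sized experts, of which only a few run per token. The function name, the split into "shared" and "per-expert" parameters, and the example sizes are all assumptions made for illustration; none of the numbers come from the paper.

```python
def activated_fraction(shared_params, params_per_expert, num_experts, top_k):
    """Fraction of all parameters used for one token in a sparse MoE model.

    Assumes every expert has the same size and that exactly `top_k`
    experts run per token (a simplification, not the paper's accounting).
    """
    if not 0 < top_k <= num_experts:
        raise ValueError("top_k must be between 1 and num_experts")
    total = shared_params + num_experts * params_per_expert
    active = shared_params + top_k * params_per_expert
    return active / total


# Hypothetical sizes in arbitrary units, chosen only to show the arithmetic.
frac = activated_fraction(
    shared_params=1.0, params_per_expert=1.0, num_experts=4, top_k=1
)
print(f"activated fraction: {frac:.0%}")
```

Under these made-up sizes, one of four experts plus the shared part gives (1 + 1) / (1 + 4) of the parameters per token; changing the expert count or `top_k` moves the fraction accordingly.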

Abstract

Recent general-purpose and domain-specific multimodal large language models (LLMs) have made remarkable progress in medical decision-making. However, they are designed for specific classification or generative tasks, and they require training or fine-tuning on large-scale datasets with large parameter counts and substantial compute, which hinders their clinical utility across diverse resource-constrained scenarios in practice. In this paper, we propose Med-MoE (Mixture-of-Experts), a novel and lightweight framework that tackles both discriminative and generative multimodal medical tasks. The learning of Med-MoE consists of three steps: multimodal medical alignment, instruction tuning and routing, and domain-specific MoE tuning. After aligning multimodal medical images with LLM tokens, we then enable the model for different multimodal medical tasks with instruction…
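
The routing idea behind a mixture-of-experts layer can be sketched generically: a router scores the experts for each input, only the top-scoring few are run, and their outputs are combined with renormalised weights. The following is a minimal pure-Python sketch of that generic top-k scheme, not the authors' router or training procedure; the function names, the toy experts, and the router logits are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k):
    """Pick the k highest-scoring experts; weights are renormalised to sum to 1."""
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(token, experts, router_logits, k=2):
    """Weighted sum of the selected experts' outputs; unselected experts never run."""
    out = [0.0] * len(token)
    for idx, weight in route_top_k(router_logits, k):
        expert_out = experts[idx](token)
        out = [o + weight * v for o, v in zip(out, expert_out)]
    return out


# Toy stand-ins for expert networks (hypothetical, for illustration only).
experts = [
    lambda x: [2.0 * v for v in x],
    lambda x: [v + 1.0 for v in x],
    lambda x: [-v for v in x],
]
logits = [0.1, 2.0, 1.0]  # made-up router scores for one token
selected = route_top_k(logits, k=2)
output = moe_forward([1.0, 2.0], experts, logits, k=2)
print(selected, output)
```

Because only the selected experts are called, the compute per token scales with `k` rather than with the total number of experts, which is the general mechanism by which sparse MoE models keep the activated parameter count low.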

Taxonomy

Topics: COVID-19 diagnosis using AI · Multimodal Machine Learning Applications

Methods: Mixture of Experts