MAM: Modular Multi-Agent Framework for Multi-Modal Medical Diagnosis via Role-Specialized Collaboration
Yucheng Zhou, Lingran Song, Jianbing Shen

TL;DR
MAM is a modular multi-agent framework that enhances multi-modal medical diagnosis by role specialization, enabling efficient knowledge updates and outperforming existing models across diverse datasets.
Contribution
We propose MAM, a novel modular multi-agent system with role-specific LLMs for improved multi-modal medical diagnosis and flexible knowledge management.
Findings
MAM outperforms modality-specific LLMs on various datasets.
Achieves 18% to 365% performance improvements over baselines.
Effective in handling text, image, audio, and video modalities.
Abstract
Recent advancements in medical Large Language Models (LLMs) have showcased their powerful reasoning and diagnostic capabilities. Despite their success, current unified multimodal medical LLMs face limitations in knowledge update costs, comprehensiveness, and flexibility. To address these challenges, we introduce the Modular Multi-Agent Framework for Multi-Modal Medical Diagnosis (MAM). Inspired by our empirical findings highlighting the benefits of role assignment and diagnostic discernment in LLMs, MAM decomposes the medical diagnostic process into specialized roles: a General Practitioner, Specialist Team, Radiologist, Medical Assistant, and Director, each embodied by an LLM-based agent. This modular and collaborative framework enables efficient knowledge updates and leverages existing medical LLMs and knowledge bases. Extensive experimental evaluations conducted on a wide range of…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsMulti-Agent Systems and Negotiation
