Mediator-Guided Multi-Agent Collaboration among Open-Source Models for Medical Decision-Making

Kaitao Chen; Mianxin Liu; Daoming Zong; Chaoyue Ding; Shaohao Rui; Yankai Jiang; Mu Zhou; Xiaosong Wang

arXiv:2508.05996 · cs.AI · October 14, 2025

TL;DR

This paper introduces MedOrch, a mediator-guided multi-agent framework that enhances medical decision-making by enabling diverse vision-language models to collaborate via an LLM mediator, improving accuracy without additional training.

Contribution

The paper presents MedOrch, a novel framework that leverages an LLM mediator to facilitate effective collaboration among heterogeneous open-source VLMs for medical multimodal tasks.

Findings

01. Collaboration among VLM agents surpasses individual performance.

02. The approach achieves superior results on five medical VQA benchmarks.

03. No additional model training is required for improved performance.

Abstract

Complex medical decision-making involves cooperative workflows carried out by different clinicians. Designing AI multi-agent systems can expedite and augment human-level clinical decision-making. Existing multi-agent research focuses primarily on language-only tasks, and its extension to multimodal scenarios remains challenging. A blind combination of diverse vision-language models (VLMs) can amplify erroneous interpretations of outcomes. In general, VLMs are less capable at instruction following and, importantly, at self-reflection than large language models (LLMs) of comparable size. This disparity largely constrains VLMs' ability to participate in cooperative workflows. In this study, we propose MedOrch, a mediator-guided multi-agent collaboration framework for medical multimodal decision-making. MedOrch employs an LLM-based mediator agent that enables multiple VLM-based expert agents to exchange…
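The contrast the abstract draws between a blind combination of experts and a mediator-guided exchange can be illustrated with a small sketch. This is not the MedOrch implementation: the abstract is truncated here and does not say what the agents exchange or how a final answer is chosen. The `Expert` and `Mediator` interfaces, the feedback string, the consensus stopping rule, and the final majority vote are all assumptions made for illustration, and the experts are toy stand-ins rather than real VLMs.

```python
from collections import Counter
from typing import Callable, List, Tuple

# Hypothetical interfaces (assumptions, not from the paper):
# an expert maps (question, mediator_feedback) -> answer string,
# a mediator maps (question, expert_answers) -> feedback string.
Expert = Callable[[str, str], str]
Mediator = Callable[[str, List[str]], str]


def blind_majority_vote(answers: List[str]) -> str:
    """Blind combination: return the most common answer, with no reflection step."""
    return Counter(answers).most_common(1)[0][0]


def mediated_collaboration(
    question: str,
    experts: List[Expert],
    mediator: Mediator,
    max_rounds: int = 2,
) -> Tuple[str, List[str]]:
    """Query the experts, let the mediator respond to disagreement, and re-query.

    Stops early once all experts agree (an assumed stopping rule).
    """
    feedback = ""
    answers: List[str] = []
    for _ in range(max_rounds):
        answers = [expert(question, feedback) for expert in experts]
        if len(set(answers)) == 1:
            break
        feedback = mediator(question, answers)
    return blind_majority_vote(answers), answers


# Toy stand-ins for VLM experts and the LLM mediator (illustration only).
def steady_expert(question: str, feedback: str) -> str:
    return "A"


def hasty_expert(question: str, feedback: str) -> str:
    # Answers "B" at first and revises to "A" once prompted to reflect.
    return "A" if feedback else "B"


def toy_mediator(question: str, answers: List[str]) -> str:
    options = ", ".join(sorted(set(answers)))
    return f"The experts disagree ({options}). Re-examine the evidence and justify your answer."


question = "Which option is correct?"
experts = [steady_expert, hasty_expert, hasty_expert]

first_round = [expert(question, "") for expert in experts]
blind_answer = blind_majority_vote(first_round)
final_answer, final_answers = mediated_collaboration(question, experts, toy_mediator)

print("blind vote:", blind_answer)
print("mediated:", final_answer, final_answers)
```

In this toy setup, where "A" is taken to be the correct option, the blind vote returns "B" because two of the three experts make the same first-round mistake, while the mediated loop ends with all three experts on "A". The example only shows why a reflection step can matter; it makes no claim about how the paper's mediator actually behaves.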
