Inquire, Interact, and Integrate: A Proactive Agent Collaborative Framework for Zero-Shot Multimodal Medical Reasoning
Zishan Gu, Fenglin Liu, Changchang Yin, Ping Zhang

TL;DR
This paper introduces MultiMedRes, a proactive multimodal medical reasoning framework in which an LLM decomposes a problem into sub-problems, interacts with domain-specific expert models, and integrates the knowledge it gathers to improve zero-shot medical reasoning, especially in X-ray visual question answering.
Contribution
It presents a novel three-step framework for proactive multimodal reasoning in healthcare, enhancing LLMs with domain-specific expert interactions for zero-shot medical tasks.
Findings
Achieves state-of-the-art zero-shot performance on X-ray VQA.
Outperforms fully supervised methods in medical reasoning tasks.
Can be integrated into various LLMs to improve multimodal medical reasoning.
Abstract
The adoption of large language models (LLMs) in healthcare has attracted significant research interest. However, their performance in healthcare remains under-investigated and potentially limited, because i) they lack rich domain-specific knowledge and medical reasoning skills; and ii) most state-of-the-art LLMs are unimodal, text-only models that cannot directly process multimodal inputs. To this end, we propose a multimodal medical collaborative reasoning framework, MultiMedRes, which incorporates a learner agent that proactively gains essential information from domain-specific expert models in order to solve medical multimodal reasoning problems. Our method includes three steps: i) Inquire: The learner agent first decomposes a given complex medical reasoning problem into multiple domain-specific sub-problems; ii) Interact: The agent then interacts with domain-specific…
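The Inquire and Interact steps named in the abstract, together with the Integrate step from the title, can be sketched as a small control loop. This is an illustrative sketch only, not the authors' implementation: the names `SubProblem`, `inquire`, `interact`, `integrate`, `toy_expert`, and `answer` are hypothetical, the learner agent is replaced by hard-coded stubs where a real system would prompt an LLM, and the expert is a toy function rather than a trained medical model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical type: an expert model maps (image, sub-question) to a text answer.
Expert = Callable[[str, str], str]


@dataclass
class SubProblem:
    domain: str    # which expert should handle this sub-problem
    question: str  # the sub-question posed to that expert


def inquire(question: str) -> List[SubProblem]:
    """Inquire: split a complex question into domain-specific sub-problems.
    Assumption: a real learner agent would prompt an LLM; this stub is hard-coded."""
    return [
        SubProblem("radiology", "Is there an opacity in the lungs?"),
        SubProblem("radiology", "Is the cardiac silhouette enlarged?"),
    ]


def interact(image: str, subs: List[SubProblem], experts: Dict[str, Expert]) -> List[str]:
    """Interact: send each sub-problem to its domain expert and record the reply."""
    return [f"Q: {s.question} A: {experts[s.domain](image, s.question)}" for s in subs]


def integrate(question: str, evidence: List[str]) -> str:
    """Integrate: combine the collected evidence into a final answer.
    Assumption: a real agent would hand the evidence back to the LLM;
    this stub applies a toy rule (any positive finding means abnormal)."""
    has_finding = any(e.endswith("A: yes") for e in evidence)
    return "abnormal" if has_finding else "normal"


def toy_expert(image: str, sub_question: str) -> str:
    """Stand-in for a domain-specific expert model (ignores the image)."""
    return "yes" if "opacity" in sub_question else "no"


def answer(image: str, question: str, experts: Dict[str, Expert]) -> str:
    """Run the three steps in order: Inquire, Interact, Integrate."""
    subs = inquire(question)
    evidence = interact(image, subs, experts)
    return integrate(question, evidence)


result = answer("chest_xray.png", "Is this chest X-ray normal?", {"radiology": toy_expert})
print(result)
```

Keeping the three steps as separate functions mirrors the structure described in the abstract: the decomposition logic, the expert registry, and the final aggregation can each be swapped out independently, which is consistent with the claim that the framework can be attached to different LLMs.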
Taxonomy
Topics: Multi-Agent Systems and Negotiation · Topic Modeling
