Second Opinion Matters: Towards Adaptive Clinical AI via the Consensus of Expert Model Ensemble
Amit Kumthekar, Zion Tilley, Henry Duong, Bhargav Patel, Michael Magnoli, Ahmed Omar, Ahmed Nasser, Chaitanya Gharpure, Yevgen Reztzov

TL;DR
This paper introduces the Consensus Mechanism, an ensemble of specialized medical expert models, to improve clinical decision-making with adaptive, robust, and cost-effective AI systems evaluated across multiple medical benchmarks.
Contribution
The paper proposes a novel ensemble framework mimicking clinical decision processes, enhancing adaptability and accuracy over single-model approaches in medical AI applications.
Findings
Achieved 61.0% accuracy on MedXpertQA, outperforming O3 and Gemini 2.5 Pro.
Improved accuracy on the MedQA and MedMCQA benchmarks by 3.4% and 9.1%, respectively.
Enhanced differential diagnosis recall, precision, and top-1 accuracy.
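For readers unfamiliar with the three differential diagnosis metrics named above, the following is a minimal sketch of how they are commonly computed for a single case. The function name `ddx_metrics`, its arguments, and the set-based definitions are illustrative assumptions; this summary does not state the exact definitions the paper uses on DDX+.

```python
from typing import List, Tuple


def ddx_metrics(
    predicted: List[str],
    gold_differential: List[str],
    gold_top: str,
) -> Tuple[float, float, bool]:
    """Score one predicted differential against a reference (assumed definitions).

    predicted:         ranked list of diagnoses from the model, most likely first
    gold_differential: reference list of plausible diagnoses for the case
    gold_top:          the single reference diagnosis used for top-1 accuracy
    """
    pred_set, gold_set = set(predicted), set(gold_differential)
    hits = len(pred_set & gold_set)

    # Recall: share of reference diagnoses that the model listed.
    recall = hits / len(gold_set) if gold_set else 0.0
    # Precision: share of listed diagnoses that appear in the reference.
    precision = hits / len(pred_set) if pred_set else 0.0
    # Top-1: whether the first-ranked prediction matches the reference diagnosis.
    top1 = bool(predicted) and predicted[0] == gold_top
    return recall, precision, top1


# Toy case with made-up data, for illustration only.
recall, precision, top1 = ddx_metrics(
    predicted=["pneumonia", "bronchitis", "pulmonary embolism"],
    gold_differential=["pneumonia", "pulmonary embolism", "tuberculosis", "bronchitis"],
    gold_top="pneumonia",
)
print(recall, precision, top1)
```

In practice these per-case values would be averaged over the whole dataset; the averaging scheme is likewise not specified in this summary.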
Abstract
Despite the growing clinical adoption of large language models (LLMs), current approaches rely heavily on single-model architectures. To overcome the risks of obsolescence and rigid dependence on single-model systems, we present a novel framework, termed the Consensus Mechanism. Mimicking clinical triage and multidisciplinary clinical decision-making, the Consensus Mechanism implements an ensemble of specialized medical expert agents, enabling improved clinical decision-making while maintaining robust adaptability. This architecture enables the Consensus Mechanism to be optimized for cost, latency, or performance purely through its internal model configuration. To rigorously evaluate the Consensus Mechanism, we employed three medical evaluation benchmarks (MedMCQA, MedQA, and MedXpertQA Text) and the differential diagnosis dataset DDX+. On MedXpertQA, the Consensus Mechanism achieved an…
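The triage-then-ensemble idea described in the abstract can be illustrated with a small sketch. Everything here is a hypothetical stand-in: the keyword-based `triage`, the `Expert` callables, and the majority-vote aggregation in `consensus` are assumptions made for illustration, since this summary does not specify how the paper routes questions or combines expert outputs. In a real system each expert would be a model call, which is what would let the configuration be tuned for cost, latency, or performance.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

# Hypothetical expert interface: takes the question text, returns an answer label.
Expert = Callable[[str], str]


def triage(question: str, keywords: Dict[str, List[str]]) -> List[str]:
    """Toy triage step: select specialties whose keywords occur in the question."""
    q = question.lower()
    chosen = [spec for spec, words in keywords.items() if any(w in q for w in words)]
    # If nothing matches, fall back to consulting every specialty.
    return chosen or list(keywords)


def consensus(
    question: str,
    experts: Dict[str, Expert],
    keywords: Dict[str, List[str]],
) -> Tuple[str, float]:
    """Ask the triaged experts and return the majority answer with its agreement rate."""
    specialties = triage(question, keywords)
    votes = Counter(experts[s](question) for s in specialties)
    # most_common sorts by count; ties keep the order in which answers were first seen.
    answer, count = votes.most_common(1)[0]
    return answer, count / len(specialties)


# Stub experts with fixed answers, standing in for specialized models.
experts: Dict[str, Expert] = {
    "cardiology": lambda q: "A",
    "pulmonology": lambda q: "A",
    "neurology": lambda q: "B",
}
keywords = {
    "cardiology": ["chest pain", "ecg"],
    "pulmonology": ["dyspnea", "cough"],
    "neurology": ["headache", "seizure"],
}

question = "A 58-year-old presents with chest pain and dyspnea. Best next step?"
print(consensus(question, experts, keywords))
```

Swapping an entry in `experts` for a different model changes the ensemble without touching the routing or voting code, which is one way to read the adaptability claim above.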
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Machine Learning in Healthcare · Topic Modeling
