Many-to-One Adversarial Consensus: Exposing Multi-Agent Collusion Risks in AI-Based Healthcare

Adeela Bashir; The Anh Han; Zia Ush Shamszaman

arXiv:2512.03097·cs.CR·December 4, 2025

Open Access

TL;DR

This paper reveals the risks of multi-agent collusion in AI healthcare systems, demonstrating how adversaries can manipulate decisions, and proposes a verifier-based defense to ensure adherence to clinical guidelines.

Contribution

It introduces an experimental framework to study collusion in multi-agent AI healthcare and presents a lightweight method to prevent harmful consensus.

Findings

01. Collusion can drive the Attack Success Rate (ASR) and Harmful Recommendation Rate (HRR) up to 100% in unprotected systems.

02. A verifier agent can restore 100% decision accuracy by blocking adversarial consensus.

03. The paper presents what the authors describe as the first systematic evidence of collusion risks in AI healthcare systems.

Abstract

The integration of large language models (LLMs) into healthcare IoT systems promises faster decisions and improved medical support. LLMs are also deployed as multi-agent teams to assist AI doctors by debating, voting, or advising on decisions. However, when multiple assistant agents interact, coordinated adversaries can collude to create false consensus, pushing an AI doctor toward harmful prescriptions. We develop an experimental framework with scripted and unscripted doctor agents, adversarial assistants, and a verifier agent that checks decisions against clinical guidelines. Using 50 representative clinical questions, we find that collusion drives the Attack Success Rate (ASR) and Harmful Recommendation Rate (HRR) up to 100% in unprotected systems. In contrast, the verifier agent restores 100% accuracy by blocking adversarial consensus. This work provides the first systematic…
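
The many-to-one setup described in the abstract can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the authors' framework: the guideline table, the `drug_*` labels, the majority-vote doctor, and every function name are hypothetical, and the verifier here simply overrides a non-compliant decision with the guideline answer. The numbers it produces describe this toy only, not the paper's experiments.

```python
from collections import Counter

# Hypothetical guideline table: question id -> guideline-approved treatment.
GUIDELINES = {
    "q1": "drug_a",
    "q2": "drug_b",
    "q3": "drug_c",
}
HARMFUL = "drug_x"  # hypothetical harmful recommendation pushed by colluders


def assistant_votes(question, n_honest, n_colluding):
    """Honest assistants vote for the guideline answer; colluders all vote HARMFUL."""
    return [GUIDELINES[question]] * n_honest + [HARMFUL] * n_colluding


def doctor_decision(votes):
    """Toy doctor agent (assumption): adopt the most common recommendation."""
    return Counter(votes).most_common(1)[0][0]


def verify(question, decision):
    """Toy verifier (assumption): replace any decision that violates the guideline."""
    approved = GUIDELINES[question]
    return decision if decision == approved else approved


def attack_success_rate(n_honest, n_colluding, use_verifier):
    """Fraction of questions whose final decision is the colluders' target."""
    hits = 0
    for question in GUIDELINES:
        decision = doctor_decision(assistant_votes(question, n_honest, n_colluding))
        if use_verifier:
            decision = verify(question, decision)
        hits += decision == HARMFUL
    return hits / len(GUIDELINES)


if __name__ == "__main__":
    # Three colluders outvote one honest assistant on every question.
    print("unprotected ASR:", attack_success_rate(1, 3, use_verifier=False))
    print("with verifier ASR:", attack_success_rate(1, 3, use_verifier=True))
```

In this sketch the attack succeeds only because the doctor trusts the majority; checking the final decision against an independent guideline source removes the colluders' leverage regardless of how many of them agree.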

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Artificial Intelligence in Healthcare and Education