Demonstrations of Integrity Attacks in Multi-Agent Systems

Can Zheng; Yuhan Cao; Xiaoning Dong; Tianxing He

arXiv:2506.04572·cs.CL·June 6, 2025

TL;DR

This paper investigates how malicious agents can subtly manipulate multi-agent systems through prompt attacks, revealing vulnerabilities in current detection methods and emphasizing the need for more secure MAS architectures.

Contribution

The study introduces four novel prompt-based attack strategies against multi-agent systems and demonstrates their effectiveness in misleading evaluations and bypassing advanced monitors.

Findings

1. Strategic prompt manipulation can bias MAS behavior systematically.
2. Current LLM-based monitors can be bypassed by crafted attacks.
3. Malicious agents can manipulate evaluations and collaboration in MAS.

Abstract

Large Language Models (LLMs) have demonstrated remarkable capabilities in natural language understanding, code generation, and complex planning. Simultaneously, Multi-Agent Systems (MAS) have garnered attention for their potential to enable cooperation among distributed agents. However, from a multi-party perspective, MAS could be vulnerable to malicious agents that exploit the system to serve self-interests without disrupting its core functionality. This work explores integrity attacks where malicious agents employ subtle prompt manipulation to bias MAS operations and gain various benefits. Four types of attacks are examined: Scapegoater, who misleads the system monitor to underestimate other agents' contributions; Boaster, who misleads the system monitor to overestimate their own performance; Self-Dealer, who manipulates other agents to adopt certain tools;…

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Ethics and Social Impacts of AI · Explainable Artificial Intelligence (XAI)

Methods: Softmax · Attention Is All You Need · ADaptive gradient method with the OPTimal convergence rate · Mixing Adam and SGD