Don't Trust Your Upstream: Exploiting LLM Multi-Agent System via Topology-Guided Adversarial Propagation
Ruichao Liang, Le Yin, Jing Chen, Yebo Feng, Cong Wu, Xiaoyu Zhang, Huangpeng Gu, Zijian Zhang, Yang Liu

TL;DR
This paper introduces a topology-aware adversarial attack on LLM-based multi-agent systems (MASs), shows that it exposes significant vulnerabilities, and proposes a mitigation strategy.
Contribution
It presents a topology-guided attack scheme that propagates adversarial contamination through MASs, demonstrates that the attack is practical in a black-box setting, and proposes a mitigation approach.
Findings
The attack achieves success rates of 40-78% on three MAS frameworks
The attack reaches 85% success on two real-world MAS applications
The proposed topology-trust mitigation blocks 94.8% of attacks
Abstract
The digital world is witnessing the rapid rise of LLM-based multi-agent systems (MASs) and their powerful applications. However, their security remains insufficiently understood, as existing evaluations are largely limited to narrow attack settings and may substantially underestimate the real risks of MAS deployments. Inspired by the MAS inter-agent dependencies, where upstream outputs are reinterpreted and executed by downstream agents, we propose a topology-aware attack scheme that propagates adversarial contamination from exposed edge agents to high-privilege agents to induce malicious behaviors. By combining topology reconnaissance, contamination propagation modeling, and hierarchical payload encapsulation, our approach overcomes the key challenges of black-box attacks and makes such multi-hop compromise practical. Experiments show that our approach achieves success rates of…
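The abstract mentions contamination propagation modeling without giving details here. As a rough illustration only, and not the authors' method, the sketch below treats a MAS as a directed graph whose edges carry an assumed probability that a contaminated upstream output is acted on by the downstream agent, then finds the most probable multi-hop path from an exposed edge agent to a high-privilege agent. The agent names, the probabilities, and the function `most_likely_path` are all hypothetical.

```python
import heapq
import math


def most_likely_path(edges, source, target):
    """Return (path, probability) of the most probable propagation path.

    edges maps an agent name to a list of (downstream_agent, probability)
    pairs. Probabilities are assumed independent per hop, so maximizing
    their product equals minimizing the sum of -log(p) (Dijkstra).
    """
    best = {source: 0.0}   # lowest accumulated -log(p) found per agent
    prev = {}              # back-pointers for path reconstruction
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == target:
            break
        if cost > best.get(node, math.inf):
            continue  # stale heap entry
        for nxt, p in edges.get(node, []):
            if p <= 0.0:
                continue  # this hop never propagates contamination
            new_cost = cost - math.log(p)
            if new_cost < best.get(nxt, math.inf):
                best[nxt] = new_cost
                prev[nxt] = node
                heapq.heappush(heap, (new_cost, nxt))
    if target not in best:
        return None, 0.0
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return path, math.exp(-best[target])


# Hypothetical topology: an exposed web-reading agent feeds a planner,
# directly and via a summarizer; the planner drives a privileged executor.
TOPOLOGY = {
    "web_reader": [("planner", 0.5), ("summarizer", 0.8)],
    "summarizer": [("planner", 0.5)],
    "planner": [("executor", 0.5)],
}

path, prob = most_likely_path(TOPOLOGY, "web_reader", "executor")
print(path, round(prob, 3))
```

In this toy graph the direct route through the planner wins (about 0.25) over the longer route through the summarizer (about 0.2). The same calculation can be read defensively: it flags which upstream links most need validation before their output reaches a privileged agent.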
