Adversarial Attack on Black-Box Multi-Agent by Adaptive Perturbation

Jianming Chen; Yawen Wang; Junjie Wang; Xiaofei Xie; Yuanzhe Hu; Qing Wang; Fanjiang Xu

arXiv:2511.15292 · cs.MA · April 29, 2026

TL;DR

This paper introduces AdapAM, a novel black-box attack framework for multi-agent systems that balances effectiveness and stealthiness using adaptive victim selection and proxy-based perturbations.

Contribution

AdapAM is the first framework to combine adaptive victim selection with proxy-based perturbation for stealthy black-box attacks on multi-agent systems.

Findings

01

AdapAM outperforms four baselines across eight environments.

02

Perturbations generated by AdapAM are less noisy and more stealthy.

03

AdapAM achieves the best attack performance at various perturbation rates.

Abstract

Evaluating the security and reliability of multi-agent systems (MAS) is urgent as they become increasingly prevalent in various applications. As an evaluation technique, existing adversarial attack frameworks face certain limitations, e.g., impracticality due to the requirement of white-box information or high control authority, and a lack of stealthiness or effectiveness, as they often target all agents or specific fixed agents. To address these issues, we propose AdapAM, a novel framework for adversarial attacks on black-box MAS. AdapAM incorporates two key components: (1) Adaptive Selection Policy simultaneously selects the victim and determines the anticipated malicious action (the action that would have the worst impact on the MAS), balancing effectiveness and stealthiness. (2) Proxy-based Perturbation to Induce Malicious Action utilizes generative adversarial imitation learning to…
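The joint choice described in component (1), picking a victim and its anticipated malicious action at the same time, can be illustrated with a toy sketch. This is not AdapAM's implementation: the abstract does not say how the selection policy is built or trained, so the sketch assumes a hypothetical lookup table `estimated_team_value` that scores each (agent, action) pair by the team outcome an attacker expects if that agent were induced to take that action. The function name, agent names, actions, and scores are all made up for illustration.

```python
from typing import Dict, Tuple

# Hypothetical type: (agent_id, action) -> attacker's estimate of the team's
# value if that agent is induced to take that action. Not from the paper.
ScoreTable = Dict[Tuple[str, str], float]


def select_victim_and_action(estimated_team_value: ScoreTable) -> Tuple[str, str]:
    """Jointly pick the (victim, action) pair with the lowest estimated team value.

    A stand-in for a selection policy: a plain argmin over a table,
    used here only to show that victim and action are chosen together.
    """
    if not estimated_team_value:
        raise ValueError("score table must not be empty")
    # min() iterates over the dict's keys; the key function looks up each score.
    return min(estimated_team_value, key=estimated_team_value.get)


# Made-up scores for two agents and two actions each.
scores: ScoreTable = {
    ("agent_0", "left"): 0.8,
    ("agent_0", "stay"): 0.5,
    ("agent_1", "left"): 0.2,
    ("agent_1", "stay"): 0.6,
}

victim, action = select_victim_and_action(scores)
print(victim, action)
```

Choosing over pairs rather than fixing the victim first means a different agent can be targeted at each step, which is the contrast the abstract draws with attacks on all agents or on specific fixed agents.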
