Constrained Black-Box Attacks Against Cooperative Multi-Agent Reinforcement Learning

Amine Andam; Jamal Bentahar; Mustapha Hedabou

arXiv:2508.09275 · cs.LG · January 22, 2026

TL;DR

This paper explores new vulnerabilities in cooperative multi-agent reinforcement learning by developing constrained black-box attacks that perturb agent observations, demonstrating effectiveness and sample efficiency across multiple benchmarks.

Contribution

The paper introduces a novel black-box attack that operates under realistic constraints: the adversary perturbs agent observations without access to the victims' policies or training data.

Findings

01. Effective attack across diverse algorithms and environments
02. Sample-efficient with only 1,000 samples needed
03. Vulnerabilities exist even with limited adversarial access

Abstract

Collaborative multi-agent reinforcement learning has rapidly evolved, offering state-of-the-art algorithms for real-world applications, including sensitive domains. However, a key challenge to its widespread adoption is the lack of a thorough investigation into its vulnerabilities to adversarial attacks. Existing work predominantly focuses on training-time attacks or unrealistic scenarios, such as access to policy weights or the ability to train surrogate policies. In this paper, we investigate new vulnerabilities under more challenging and constrained conditions, assuming an adversary can only collect and perturb the observations of deployed agents. We also consider scenarios where the adversary has no access at all (no observations, actions, or weights). Our main approach is to generate perturbations that intentionally misalign how victim agents see their environment. Our approach is…
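
To make the idea of misaligning observations concrete, here is a minimal sketch of a bounded observation perturbation. It is not the paper's algorithm, which the abstract excerpt above does not specify. The function name `misalign_observations`, the per-feature budget `epsilon`, and the alternating sign pattern are all assumptions chosen for illustration. The sketch only shows the general shape of the threat model: the adversary touches observations alone, stays within a fixed budget, and pushes different agents' views of the same state in different directions.

```python
def misalign_observations(observations, epsilon):
    """Return perturbed copies of per-agent observations (illustrative only).

    observations: list of lists of floats, one inner list per agent.
    epsilon: maximum absolute change per feature (an assumed L-infinity budget).
    """
    perturbed = []
    for agent_idx, obs in enumerate(observations):
        new_obs = []
        for feat_idx, value in enumerate(obs):
            # Alternate the sign by agent and feature index so that two agents
            # observing the same state receive opposite shifts. This pattern
            # is a hypothetical stand-in for a learned or optimized one.
            sign = 1.0 if (agent_idx + feat_idx) % 2 == 0 else -1.0
            new_obs.append(value + sign * epsilon)
        perturbed.append(new_obs)
    return perturbed


# Two agents that see the same clean state end up with inconsistent views.
clean = [[0.5, -0.2, 1.0], [0.5, -0.2, 1.0]]
attacked = misalign_observations(clean, epsilon=0.1)
```

The function needs no policy weights, actions, or rewards, which mirrors the constrained access described in the abstract. The fixed sign pattern is deliberately naive; it stands in for whatever procedure an actual attack would use to choose perturbation directions.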
