Adversarial attacks in consensus-based multi-agent reinforcement learning
Martin Figura, Krishna Chaitanya Kosaraju, and Vijay Gupta

TL;DR
This paper studies adversarial attacks on consensus-based multi-agent reinforcement learning (MARL) and shows that a single malicious agent can steer the policies of every other agent in the network.
Contribution
It shows that an adversarial agent can persuade all other agents in a consensus-based MARL network to implement policies that optimize the adversary's own objective, exposing a critical security flaw.
Findings
A single adversarial agent can persuade the entire network to adopt policies that optimize its own objective
Consensus-based MARL algorithms are vulnerable to adversarial manipulation
Standard algorithms lack robustness against targeted attacks
Abstract
Recently, many cooperative distributed multi-agent reinforcement learning (MARL) algorithms have been proposed in the literature. In this work, we study the effect of adversarial attacks on a network that employs a consensus-based MARL algorithm. We show that an adversarial agent can persuade all the other agents in the network to implement policies that optimize an objective that it desires. In this sense, the standard consensus-based MARL algorithms are fragile to attacks.
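The fragility described in the abstract can be illustrated with a toy example. The sketch below is not the paper's algorithm: it replaces the learned MARL parameters with one scalar per agent and uses plain neighborhood averaging as the consensus step. The ring topology, the initial values, and the function name `run_consensus` are all assumptions made for illustration. The adversary is modeled as an agent that simply never applies the consensus update and keeps broadcasting its own value.

```python
def run_consensus(values, neighbors, adversary=None, steps=200):
    """Repeated neighborhood averaging over scalar 'parameters'.

    values:    initial value held by each agent (hypothetical stand-in
               for the parameters agents would share in consensus MARL)
    neighbors: neighbors[i] lists the agents that agent i hears from
    adversary: index of an agent that never updates, or None
    """
    x = list(values)
    for _ in range(steps):
        new_x = []
        for i, nbrs in enumerate(neighbors):
            if i == adversary:
                # Assumed attack model: ignore neighbors, keep own value.
                new_x.append(x[i])
            else:
                # Honest agent: average own value with its neighbors' values.
                group = [i] + nbrs
                new_x.append(sum(x[j] for j in group) / len(group))
        x = new_x
    return x


# Ring of five agents; agent 4 will play the adversary in the second run.
neighbors = [[4, 1], [0, 2], [1, 3], [2, 4], [3, 0]]
initial = [1.0, 2.0, 3.0, 4.0, 10.0]

honest = run_consensus(initial, neighbors)
attacked = run_consensus(initial, neighbors, adversary=4)

print("no adversary:  ", [round(v, 4) for v in honest])
print("with adversary:", [round(v, 4) for v in attacked])
```

Without an adversary, the agents agree on the average of the initial values. With agent 4 refusing to update, every honest agent is pulled to agent 4's value instead, which mirrors at toy scale the claim that one agent can dictate what the whole network converges to.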
