Adversarial Policies: Attacking Deep Reinforcement Learning
Adam Gleave, Michael Dennis, Cody Wild, Neel Kant, Sergey Levine, Stuart Russell

TL;DR
This paper demonstrates that, in multi-agent environments, an attacker can learn an adversarial policy that reliably defeats a deep reinforcement learning agent. The attacker never modifies the victim's observations directly; it acts in the shared environment so that the natural observations the victim receives are themselves adversarial.
Contribution
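It introduces the concept of adversarial policies in multi-agent RL and demonstrates that they exist in zero-sum games between simulated humanoid robots, where they reliably beat state-of-the-art victims trained via self-play to be robust to opponents.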
Findings
Adversarial policies reliably win against victims trained to be robust, even though their behavior looks random and uncoordinated.
They induce different neural activations in the victim than normal opponents do.
Adversarial policies are more successful in high-dimensional environments.
Abstract
Deep reinforcement learning (RL) policies are known to be vulnerable to adversarial perturbations to their observations, similar to adversarial examples for classifiers. However, an attacker is not usually able to directly modify another agent's observations. This might lead one to wonder: is it possible to attack an RL agent simply by choosing an adversarial policy acting in a multi-agent environment so as to create natural observations that are adversarial? We demonstrate the existence of adversarial policies in zero-sum games between simulated humanoid robots with proprioceptive observations, against state-of-the-art victims trained via self-play to be robust to opponents. The adversarial policies reliably win against the victims but generate seemingly random and uncoordinated behavior. We find that these policies are more successful in high-dimensional environments, and induce…
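The threat model in the abstract can be illustrated with a toy sketch. This is not the paper's setup: rock-paper-scissors stands in for the humanoid-robot games, tabular Q-learning stands in for whatever deep RL algorithm trains the adversary, and every name below (`victim_policy`, `FixedVictimEnv`, `train_adversary`, `evaluate`) is hypothetical. The point it tries to show is only that, once the victim is frozen, the two-player game looks like an ordinary single-agent problem to the attacker, who controls nothing except its own actions.

```python
import random

# Moves: 0 = rock, 1 = paper, 2 = scissors. Move (m + 1) % 3 beats move m.
def beats(m):
    return (m + 1) % 3

def payoff(adv, vic):
    """Zero-sum payoff to the adversary: +1 win, 0 draw, -1 loss."""
    if adv == vic:
        return 0
    return 1 if adv == beats(vic) else -1

def victim_policy(obs):
    """Frozen victim (assumed for illustration): it expects the opponent
    to repeat its previous move and plays the counter to that move."""
    return beats(obs)

class FixedVictimEnv:
    """Single-agent view of the two-player game with the victim held fixed.

    The victim's observation is the adversary's previous move. The adversary
    never edits that observation; it only shapes it through its own play.
    """
    def __init__(self):
        self.last_adv = 0

    def reset(self):
        self.last_adv = 0
        return self.last_adv

    def step(self, adv_action):
        vic_action = victim_policy(self.last_adv)
        reward = payoff(adv_action, vic_action)
        self.last_adv = adv_action
        return self.last_adv, reward

def train_adversary(steps=20000, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    """Tabular Q-learning for the adversary against the frozen victim."""
    rng = random.Random(seed)
    q = [[0.0] * 3 for _ in range(3)]  # q[state][action]
    env = FixedVictimEnv()
    s = env.reset()
    for _ in range(steps):
        if rng.random() < eps:
            a = rng.randrange(3)                       # explore
        else:
            a = max(range(3), key=lambda x: q[s][x])   # exploit
        s2, r = env.step(a)
        q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
        s = s2
    return q

def evaluate(q, rounds=100):
    """Average adversary payoff per round when acting greedily on q."""
    env = FixedVictimEnv()
    s = env.reset()
    total = 0
    for _ in range(rounds):
        a = max(range(3), key=lambda x: q[s][x])
        s, r = env.step(a)
        total += r
    return total / rounds
```

Calling `evaluate(train_adversary())` reports the adversary's average payoff per round against the frozen victim. The victim here is deliberately exploitable, so the sketch says nothing about how hard the attack is against a victim trained via self-play; it only makes the "act, do not perturb" structure of the attack concrete.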
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics · Anomaly Detection Techniques and Applications
