Policy Teaching via Environment Poisoning: Training-time Adversarial Attacks against Reinforcement Learning

Amin Rakhsha; Goran Radanovic; Rati Devidze; Xiaojin Zhu; Adish Singla

arXiv:2003.12909 · cs.LG · August 20, 2020 · 37 citations

TL;DR

This paper studies training-time adversarial attacks on reinforcement learning agents in which an attacker poisons the environment, by manipulating rewards or transition dynamics, to stealthily force the agent into an attacker-chosen target policy.

Contribution

The paper introduces an optimization framework for finding optimal stealthy environment poisoning attacks on RL agents. It gives sufficient conditions under which the attack is feasible, provides lower and upper bounds on the attack cost, and demonstrates the attacks in offline and online settings.

Findings

01 The attacker can successfully teach target policies under mild conditions.

02 Stealthy attacks can manipulate RL agents in both planning and learning scenarios.

03 The results identify significant security threats for RL applications.

Abstract

We study a security threat to reinforcement learning where an attacker poisons the learning environment to force the agent into executing a target policy chosen by the attacker. As a victim, we consider RL agents whose objective is to find a policy that maximizes average reward in undiscounted infinite-horizon problem settings. The attacker can manipulate the rewards or the transition dynamics in the learning environment at training-time and is interested in doing so in a stealthy manner. We propose an optimization framework for finding an optimal stealthy attack for different measures of attack cost. We provide sufficient technical conditions under which the attack is feasible and provide lower/upper bounds on the attack cost. We instantiate our attacks in two settings: (i) an offline setting where the agent is doing planning in the poisoned environment, and (ii) an…

Code & Models

Repositories

adishs/icml2020_rl-policy-teaching_code (Official)

Videos

Policy Teaching via Environment Poisoning: Training-time Adversarial Attacks against Reinforcement Learning · SlidesLive

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics · Smart Grid Security and Resilience