Understanding the Limits of Poisoning Attacks in Episodic Reinforcement Learning
Anshuka Rangi, Haifeng Xu, Long Tran-Thanh, Massimo Franceschetti

TL;DR
This paper investigates the effectiveness of poisoning attacks on episodic reinforcement learning algorithms, showing that attack success depends on whether rewards are bounded or unbounded, and it bounds the cost of a successful attack.
Contribution
It characterizes the limits of reward and action poisoning attacks in episodic RL, showing when such attacks can steer an order-optimal learning algorithm toward a targeted policy.
Findings
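In bounded reward settings, neither reward manipulation nor action manipulation alone can guarantee a successful attack, but combining the two can steer any order-optimal learning algorithm toward any targeted policy.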
In unbounded reward settings, reward manipulation alone suffices for successful attacks.
Attack costs are order-optimal, scaling as $T^{\frac{1}{2}}$, for successful policy manipulation.
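The contrast between bounded and unbounded rewards can be illustrated with a toy sketch. This is not the paper's attack: the function `poison_reward`, the `penalty` parameter, the clipping rule, and the choice to count cost as the absolute change in reward are all assumptions made for illustration. The sketch only shows that a reward range caps how far an adversary can lower the reward of a non-target action.

```python
from typing import Optional, Tuple


def poison_reward(
    reward: float,
    action: int,
    target_action: int,
    penalty: float,
    bounds: Optional[Tuple[float, float]] = None,
) -> Tuple[float, float]:
    """Return (observed_reward, attack_cost) for a single step.

    Hypothetical toy rule: leave the target action untouched and lower the
    reward of any other action by `penalty`. If `bounds` is given, the
    poisoned reward must stay inside [low, high].
    """
    if action == target_action:
        return reward, 0.0  # no contamination, so no cost
    poisoned = reward - penalty
    if bounds is not None:
        low, high = bounds
        poisoned = min(max(poisoned, low), high)  # clip to the allowed range
    # Assumed cost model: absolute change applied to the reward.
    return poisoned, abs(reward - poisoned)


# Bounded rewards in [0, 1]: the penalty is capped by the lower bound.
bounded = poison_reward(0.5, action=1, target_action=0, penalty=5.0, bounds=(0.0, 1.0))
# Unbounded rewards: the full penalty is applied.
unbounded = poison_reward(0.5, action=1, target_action=0, penalty=5.0)
print(bounded, unbounded)
```

In the bounded call the adversary can move the reward only from 0.5 down to 0.0, whereas in the unbounded call the same request shifts it by the full penalty.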
Abstract
To understand the security threats to reinforcement learning (RL) algorithms, this paper studies poisoning attacks to manipulate any order-optimal learning algorithm towards a targeted policy in episodic RL and examines the potential damage of two natural types of poisoning attacks, i.e., the manipulation of reward and action. We discover that the effect of attacks crucially depends on whether the rewards are bounded or unbounded. In bounded reward settings, we show that only reward manipulation or only action manipulation cannot guarantee a successful attack. However, by combining reward and action manipulation, the adversary can manipulate any order-optimal learning algorithm to follow any targeted policy with an order-optimal total attack cost, without any knowledge of the underlying MDP. In contrast, in unbounded reward settings,…
