Online Poisoning Attack Against Reinforcement Learning under Black-box Environments

Jianhui Li; Bokang Zhang; Junfeng Wu

arXiv:2412.00797·cs.LG·December 3, 2024

TL;DR

This paper introduces an online poisoning attack method targeting reinforcement learning in black-box environments, manipulating reward functions and state transitions to mislead the agent, validated through maze environment experiments.

Contribution

It presents a novel black-box poisoning attack algorithm for reinforcement learning, addressing unknown environment dynamics and using sample-based gradient estimation.

Findings

01. Effective poisoning demonstrated in maze environment

02. Algorithm successfully manipulates reward and transition data

03. Addresses challenges of unknown environment dynamics

Abstract

This paper proposes an online environment poisoning algorithm tailored for reinforcement learning agents operating in a black-box setting, where an adversary deliberately manipulates training data to lead the agent toward a mischievous policy. In contrast to prior studies that primarily investigate white-box settings, we focus on a scenario in which the environment dynamics are unknown to the attacker and the targeted agent employs a flexible reinforcement learning algorithm. We first propose an attack scheme that is capable of poisoning the reward functions and state transitions. The poisoning task is formalized as a constrained optimization problem, following the framework of Ma et al. (2019). Given that the transition probabilities are unknown to the attacker in a black-box environment, we apply a stochastic gradient descent algorithm, where the exact gradients…
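The idea of running stochastic gradient descent when transition probabilities cannot be read directly can be illustrated with a toy sketch. This is not the paper's algorithm: the objective, the three-state `HIDDEN_P` distribution, the `values` table, and all function names are assumptions made up for illustration. The optimizer only ever draws samples from the environment, and each sample yields an unbiased estimate of the gradient of an expectation.

```python
import random

# Hypothetical hidden dynamics: next-state probabilities for one fixed
# state-action pair. The optimizer never reads this list directly.
HIDDEN_P = [0.2, 0.5, 0.3]


def sample_next_state(rng):
    """Black-box environment call: returns one sampled next state."""
    return rng.choices([0, 1, 2], weights=HIDDEN_P)[0]


def sgd_with_sampled_gradients(sample_fn, values, steps=5000, lr0=0.5, seed=0):
    """Minimize f(theta) = E_{s'}[(theta - values[s'])^2] from samples only.

    The exact gradient 2 * (theta - E[values[s']]) needs the unknown
    transition probabilities, so each step uses a one-sample estimate.
    """
    rng = random.Random(seed)
    theta = 0.0
    for t in range(steps):
        s_next = sample_fn(rng)                 # query the black box once
        grad = 2.0 * (theta - values[s_next])   # unbiased gradient estimate
        theta -= (lr0 / (t + 1)) * grad         # decaying step size
    return theta


# Assumed per-state values, used only to define the toy objective.
values = [0.0, 1.0, 4.0]
theta = sgd_with_sampled_gradients(sample_next_state, values)
print(f"estimate from samples: {theta:.3f}")
```

With the step size `lr0 / (t + 1)` and `lr0 = 0.5`, the update reduces to a running average of the sampled values, so `theta` approaches the true expectation (1.7 for the assumed numbers) even though the probabilities are never exposed to the optimizer.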

Taxonomy

Topics: Adversarial Robustness in Machine Learning
