COPA: Certifying Robust Policies for Offline Reinforcement Learning against Poisoning Attacks
Fan Wu, Linyi Li, Chejian Xu, Huan Zhang, Bhavya Kailkhura, Krishnaram Kenthapadi, Ding Zhao, Bo Li

TL;DR
This paper introduces COPA, a novel certification framework that assesses and guarantees the robustness of offline reinforcement learning policies against poisoning attacks on training data, using new criteria and protocols.
Contribution
COPA is the first framework to certify robustness of offline RL policies against poisoning, with new certification criteria and protocols that improve guarantees and efficiency.
Findings
Robust aggregation protocols significantly enhance certification.
Certification methods are efficient and tight.
Robustness varies across algorithms and environments.
Abstract
As reinforcement learning (RL) has achieved near human-level performance in a variety of tasks, its robustness has attracted great attention. While a vast body of research has explored test-time (evasion) attacks in RL and corresponding defenses, the robustness of RL against training-time (poisoning) attacks remains largely unexplored. In this work, we focus on certifying the robustness of offline RL in the presence of poisoning attacks, where a subset of training trajectories could be arbitrarily manipulated. We propose the first certification framework, COPA, to certify the number of poisoned trajectories that can be tolerated under different certification criteria. Given the complex structure of RL, we propose two certification criteria: per-state action stability and cumulative reward bound. To further improve the certification, we propose new partition and aggregation protocols to…
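The abstract is cut off before it describes the protocols, so the following is only a minimal sketch of the general partition-then-vote idea behind per-state action stability, not the paper's own algorithm. It assumes that each trajectory is assigned to exactly one partition by a deterministic hash, that one sub-policy is trained per partition, and that the aggregated policy takes a majority vote with ties broken toward the smaller action index. Under those assumptions, one poisoned trajectory can change at most one vote, which yields a simple bound on the tolerated poisoning size. The names `partition_index` and `vote_and_certify` are hypothetical.

```python
import hashlib
from collections import Counter


def partition_index(trajectory, num_partitions):
    """Deterministically map a trajectory to one partition (hypothetical helper).

    Hashing the trajectory content means a poisoned trajectory can only
    land in, and therefore only affect, a single partition.
    """
    digest = hashlib.sha256(repr(trajectory).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_partitions


def vote_and_certify(votes, num_actions):
    """Majority vote over sub-policy actions at a single state.

    votes: one action index per sub-policy, each in range(num_actions).
    Returns (winner, tolerated), where `tolerated` is the number of changed
    votes (hence poisoned trajectories, under the assumptions above) that
    provably cannot change the winning action. Assumes num_actions >= 2.
    """
    counts = Counter(votes)
    # Highest count wins; ties go to the smaller action index.
    winner = min(range(num_actions), key=lambda a: (-counts[a], a))
    tolerated = len(votes)
    for other in range(num_actions):
        if other == winner:
            continue
        # A rival with a smaller index wins ties, so it needs one vote less.
        tie_bonus = 1 if other < winner else 0
        margin = counts[winner] - (counts[other] + tie_bonus)
        # Each changed vote can shrink the margin by at most 2.
        tolerated = min(tolerated, margin // 2)
    return winner, tolerated


if __name__ == "__main__":
    # Seven sub-policies vote at one state in a three-action environment.
    action, k = vote_and_certify([0, 0, 0, 0, 1, 1, 2], num_actions=3)
    print(action, k)
```

In this sketch the certificate is computed per state, so a rollout would call `vote_and_certify` at every step and could report the smallest tolerated size along the trajectory. Whether that matches the paper's cumulative reward criterion is not something this excerpt settles.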
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics · Ethics and Social Impacts of AI
