Optimal Perturbation Budget Allocation for Data Poisoning in Offline Reinforcement Learning

Junnan Qiu; Yuanjie Zhao; Jie Li

arXiv:2512.08485 · cs.LG · December 11, 2025

TL;DR

This paper introduces a global budget allocation attack for offline RL data poisoning, optimizing perturbations based on TD-error influence to maximize attack efficiency and stealthiness.

Contribution

It proposes a novel global resource allocation method for data poisoning in offline RL, leveraging TD-error sensitivity for more effective and stealthy attacks.

Findings

01 Achieves up to 80% performance degradation

02 Outperforms baseline strategies significantly

03 Evades state-of-the-art detection methods

Abstract

Offline Reinforcement Learning (RL) enables policy optimization from static datasets but is inherently vulnerable to data poisoning attacks. Existing attack strategies typically rely on locally uniform perturbations, which treat all samples indiscriminately. This approach is inefficient, as it wastes the perturbation budget on low-impact samples, and lacks stealthiness due to significant statistical deviations. In this paper, we propose a novel Global Budget Allocation attack strategy. Leveraging the theoretical insight that a sample's influence on value function convergence is proportional to its Temporal Difference (TD) error, we formulate the attack as a global resource allocation problem. We derive a closed-form solution where perturbation magnitudes are assigned proportional to the TD-error sensitivity under a global L2 constraint. Empirical results on D4RL benchmarks demonstrate…
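
The proportional allocation described in the abstract can be illustrated with a small sketch. The abstract does not give the paper's exact objective, so the code below assumes a simple stand-in: maximize the TD-error-weighted sum of perturbation magnitudes subject to a global L2 budget. Under that assumption, the Cauchy-Schwarz inequality gives the closed form eps_i = B * |delta_i| / ||delta||_2. The names `allocate_budget`, `uniform_budget`, `td_errors`, and `budget` are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
import math


def allocate_budget(td_errors, budget):
    """Split a global L2 budget across samples in proportion to |TD error|.

    Assumed objective (not taken from the paper):
        maximize   sum_i |delta_i| * eps_i
        subject to sqrt(sum_i eps_i ** 2) <= budget
    By Cauchy-Schwarz the maximizer is eps_i = budget * |delta_i| / ||delta||_2.
    """
    sensitivities = [abs(d) for d in td_errors]
    norm = math.sqrt(sum(s * s for s in sensitivities))
    if norm == 0.0:
        # All TD errors are zero, so no sample is worth perturbing.
        return [0.0 for _ in sensitivities]
    return [budget * s / norm for s in sensitivities]


def uniform_budget(n, budget):
    """Baseline: the same magnitude for every sample, with the same global L2 norm."""
    return [budget / math.sqrt(n)] * n


def weighted_impact(td_errors, eps):
    """Value of the assumed objective: sum_i |delta_i| * eps_i."""
    return sum(abs(d) * e for d, e in zip(td_errors, eps))


# Toy TD errors for three samples (illustrative values, not real data).
td_errors = [3.0, -4.0, 0.0]
eps = allocate_budget(td_errors, budget=1.0)
print(eps)  # [0.6, 0.8, 0.0]

eps_uniform = uniform_budget(len(td_errors), budget=1.0)
print(weighted_impact(td_errors, eps), weighted_impact(td_errors, eps_uniform))
```

In this toy case the sample with zero TD error receives no perturbation, while the uniform baseline spends part of the same L2 budget on it. That matches the abstract's point about budget wasted on low-impact samples, at least under the assumed objective.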

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics · Privacy-Preserving Technologies in Data