Robust Offline Reinforcement Learning -- Certify the Confidence Interval

Jiarui Yao; Simon Shaolei Du

arXiv:2309.16631 · cs.LG · October 4, 2023

TL;DR

This paper introduces a method to certify the robustness of offline reinforcement learning policies using random smoothing, providing theoretical guarantees and validating effectiveness through experiments.

Contribution

It develops a novel algorithm that certifies policy robustness in offline RL with rigorous theoretical analysis, unlike prior empirical approaches.

Findings

01 The algorithm certifies the robustness of a given policy offline.

02 With random smoothing, the method runs as efficiently as algorithms without it.

03 Experiments on different environments confirm the algorithm's correctness.

Abstract

Reinforcement learning (RL), especially deep RL, has received growing attention from the research community. However, the security of RL has become a pressing problem as attack methods mature. To defend against such adversarial attacks, several practical approaches have been developed, such as adversarial training and data filtering. However, these methods are mostly based on empirical algorithms and experiments, without rigorous theoretical analysis of the robustness of the algorithms. In this paper, we develop an algorithm to certify the robustness of a given policy offline with random smoothing, which can be proven and run as efficiently as algorithms without random smoothing. Experiments on different environments confirm the correctness of our algorithm.
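To make the idea of smoothing a policy and attaching a confidence interval more concrete, here is a minimal Python sketch. It is not the paper's algorithm: it only shows the generic pattern of perturbing observations with Gaussian noise before querying a base policy, then bounding the smoothed policy's mean return with a two-sided Hoeffding interval. The names `smooth`, `hoeffding_interval`, `toy_rollout`, and `base_policy`, the toy one-dimensional environment, and the parameters `sigma` and `delta` are all assumptions made for illustration. The sketch also assumes returns are bounded in a known range, and it says nothing about behavior under an actual adversarial attack.

```python
import math
import random


def smooth(policy, sigma, rng):
    """Wrap `policy` so each observation gets N(0, sigma^2) noise added (illustrative)."""
    def smoothed(obs):
        noisy = [x + rng.gauss(0.0, sigma) for x in obs]
        return policy(noisy)
    return smoothed


def hoeffding_interval(returns, r_min, r_max, delta):
    """Two-sided Hoeffding interval for the mean of i.i.d. returns in [r_min, r_max].

    Holds with probability at least 1 - delta under the boundedness assumption.
    """
    n = len(returns)
    mean = sum(returns) / n
    half_width = (r_max - r_min) * math.sqrt(math.log(2.0 / delta) / (2.0 * n))
    return mean - half_width, mean + half_width


def toy_rollout(policy, rng, horizon=10):
    """Hypothetical 1-D task: reward 1 when the action matches the sign of the state."""
    total = 0.0
    for _ in range(horizon):
        state = [rng.uniform(-1.0, 1.0)]
        action = policy(state)
        total += 1.0 if (action == 1) == (state[0] >= 0.0) else 0.0
    return total


def base_policy(obs):
    """Hypothetical base policy: act on the sign of the (possibly noisy) observation."""
    return 1 if obs[0] >= 0.0 else 0


rng = random.Random(0)
policy = smooth(base_policy, sigma=0.1, rng=rng)
returns = [toy_rollout(policy, rng) for _ in range(500)]
low, high = hoeffding_interval(returns, r_min=0.0, r_max=10.0, delta=0.05)
print(f"95% interval for the smoothed policy's mean return: [{low:.3f}, {high:.3f}]")
```

The interval width depends only on the return range, the number of sampled episodes, and `delta`, so tightening it is a matter of collecting more rollouts; a tighter bound such as Clopper-Pearson could be swapped in when returns are binary.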

Taxonomy

Topics: Adversarial Robustness in Machine Learning