Reward Certification for Policy Smoothed Reinforcement Learning

Ronghui Mu; Leandro Soriano Marcolino; Tianle Zhang; Yanghao Zhang; Xiaowei Huang; Wenjie Ruan

arXiv:2312.06436 · cs.LG · December 13, 2023 · 1 citation

TL;DR

This paper introduces a black-box method that certifies a lower bound on the cumulative reward of smoothed reinforcement learning policies under various $l_{p}$-norm-bounded perturbations, and reports tighter bounds and better efficiency than existing techniques.

Contribution

The paper proposes a general certification approach that uses f-divergence to derive reward guarantees for smoothed RL policies, and extends it to perturbations of the action space.

Findings

01. Improves the certified lower bound of mean cumulative reward.

02. Demonstrates better efficiency than state-of-the-art methods.

03. Validates effectiveness through experiments in multiple environments.

Abstract

Reinforcement Learning (RL) has achieved remarkable success in safety-critical areas, but it can be weakened by adversarial attacks. Recent studies have introduced "smoothed policies" in order to enhance its robustness. Yet, it is still challenging to establish a provable guarantee to certify the bound of its total reward. Prior methods relied primarily on computing bounds using Lipschitz continuity or calculating the probability of cumulative reward above specific thresholds. However, these techniques are only suited for continuous perturbations on the RL agent's observations and are restricted to perturbations bounded by the $l_{2}$-norm. To address these limitations, this paper proposes a general black-box certification method capable of directly certifying the cumulative reward of the smoothed policy under various $l_{p}$-norm bounded perturbations. Furthermore, we extend our…
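
To make the term "smoothed policy" concrete, here is a minimal sketch, assuming the common construction in which the agent acts on a randomly noised copy of each observation. The Gaussian noise, the toy 1-D task, and every name below (`smoothed_policy`, `rollout_return`, `mean_return`, `toy_policy`, `toy_step`, `toy_reset`) are illustrative assumptions, not the paper's code. The sketch only estimates the mean cumulative reward empirically; it does not implement the certification step described in the abstract.

```python
import random


def smoothed_policy(policy, sigma, rng):
    """Return a policy that sees each observation through Gaussian noise."""
    def act(obs):
        # Noise is added only to the policy's view; the true state is untouched.
        noisy_obs = [x + rng.gauss(0.0, sigma) for x in obs]
        return policy(noisy_obs)
    return act


def rollout_return(act, step, reset, horizon):
    """Cumulative reward of one episode of at most `horizon` steps."""
    obs, total = reset(), 0.0
    for _ in range(horizon):
        obs, reward, done = step(obs, act(obs))
        total += reward
        if done:
            break
    return total


def mean_return(act, step, reset, horizon, episodes):
    """Monte Carlo estimate of the mean cumulative reward (not a certificate)."""
    returns = [rollout_return(act, step, reset, horizon) for _ in range(episodes)]
    return sum(returns) / episodes


# Hypothetical 1-D toy task: drive the state toward zero.
def toy_policy(obs):
    return -1.0 if obs[0] > 0 else 1.0


def toy_step(obs, action):
    nxt = [obs[0] + 0.1 * action]
    return nxt, -abs(nxt[0]), False  # (next observation, reward, done)


def toy_reset():
    return [1.0]


if __name__ == "__main__":
    rng = random.Random(0)
    act = smoothed_policy(toy_policy, sigma=0.2, rng=rng)
    print(mean_return(act, toy_step, toy_reset, horizon=20, episodes=100))
```

Setting `sigma=0.0` recovers the unsmoothed policy, which gives a quick baseline for how much reward the smoothing noise costs on a given task.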

Code & Models

Repositories

trustai/receps (PyTorch, official)

Videos

Reward Certification for Policy Smoothed Reinforcement Learning (Underline)

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics