Bounded Robustness in Reinforcement Learning via Lexicographic Objectives

Daniel Jarne Ornia; Licio Romao; Lewis Hammond; Manuel Mazo Jr.; Alessandro Abate

arXiv:2209.15320 · cs.LG · December 12, 2023

TL;DR

This paper introduces a formal framework for enhancing policy robustness in reinforcement learning by analyzing noise effects through a stochastic operator, and proposes a lexicographic optimization scheme to balance robustness and utility.

Contribution

It provides a novel theoretical analysis linking robustness to noise properties and introduces a general robustness-inducing method for policy gradient algorithms.

Findings

1. Established connections between noise characteristics and policy robustness.

2. Derived sufficient conditions for robustness in policies.

3. Proposed a lexicographic optimization scheme that balances robustness and utility.

Abstract

Policy robustness in Reinforcement Learning may not be desirable at any cost: the alterations caused by robustness requirements from otherwise optimal policies should be explainable, quantifiable and formally verifiable. In this work we study how policies can be maximally robust to arbitrary observational noise by analysing how they are altered by this noise through a stochastic linear operator interpretation of the disturbances, and establish connections between robustness and properties of the noise kernel and of the underlying MDPs. Then, we construct sufficient conditions for policy robustness, and propose a robustness-inducing scheme, applicable to any policy gradient algorithm, that formally trades off expected policy utility for robustness through lexicographic optimisation, while preserving convergence and sub-optimality in the policy synthesis.
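As a rough illustration of the two ideas in the abstract, the sketch below treats observational noise as a row-stochastic matrix acting linearly on a tabular policy, then picks a policy lexicographically: utility first, robustness second. It is a minimal toy under stated assumptions, not the paper's algorithm, which works with policy gradient methods. The names `apply_noise`, `robustness_gap`, `lexicographic_pick`, the tolerance `eps`, the example kernel `T`, and the utility values are all hypothetical and chosen only for illustration.

```python
def apply_noise(T, pi):
    """Policy induced when states are observed through noise kernel T.

    T[s][o]  : probability of observing state o when the true state is s.
    pi[o][a] : probability of taking action a when the agent observes o.
    The result is linear in pi: row s is the T[s]-weighted mix of pi's rows.
    """
    n_states, n_actions = len(T), len(pi[0])
    return [
        [sum(T[s][o] * pi[o][a] for o in range(n_states)) for a in range(n_actions)]
        for s in range(n_states)
    ]


def robustness_gap(T, pi):
    """Largest per-state total-variation distance between pi and its noisy version."""
    noisy = apply_noise(T, pi)
    return max(
        0.5 * sum(abs(p - q) for p, q in zip(row, noisy_row))
        for row, noisy_row in zip(pi, noisy)
    )


def lexicographic_pick(candidates, utilities, T, eps):
    """Toy lexicographic choice (assumption: a finite candidate set).

    First priority: keep candidates whose utility is within eps of the best.
    Second priority: among those, return the name with the smallest gap.
    """
    best = max(utilities.values())
    near_optimal = [name for name in candidates if utilities[name] >= best - eps]
    return min(near_optimal, key=lambda name: robustness_gap(T, candidates[name]))


# Hypothetical 2-state, 2-action example.
T = [[0.9, 0.1], [0.2, 0.8]]
candidates = {
    "greedy": [[1.0, 0.0], [0.0, 1.0]],    # acts differently in each state
    "soft": [[0.7, 0.3], [0.3, 0.7]],      # less sensitive to misread states
    "constant": [[0.5, 0.5], [0.5, 0.5]],  # ignores the observation entirely
}
# Made-up utilities (expected returns), for illustration only.
utilities = {"greedy": 1.00, "soft": 0.95, "constant": 0.60}

gap_greedy = robustness_gap(T, candidates["greedy"])
gap_constant = robustness_gap(T, candidates["constant"])
strict_choice = lexicographic_pick(candidates, utilities, T, eps=0.0)
relaxed_choice = lexicographic_pick(candidates, utilities, T, eps=0.1)
```

In this toy setup the constant policy is unaffected by the noise (gap of about 0) while the greedy policy has a gap of about 0.2. With `eps=0.0` only the highest-utility policy survives the first stage, so `greedy` is chosen; relaxing to `eps=0.1` admits `soft`, which is then preferred for its smaller gap. The tolerance is what bounds how much utility is given up for robustness.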

Taxonomy

Topics: Reinforcement Learning in Robotics · Fuel Cells and Related Materials · Adversarial Robustness in Machine Learning
