Bounded Robustness in Reinforcement Learning via Lexicographic Objectives
Daniel Jarne Ornia, Licio Romao, Lewis Hammond, Manuel Mazo Jr., Alessandro Abate

TL;DR
This paper introduces a formal framework for enhancing policy robustness in reinforcement learning by analyzing noise effects through a stochastic operator, and proposes a lexicographic optimization scheme to balance robustness and utility.
Contribution
The paper provides a novel theoretical analysis linking policy robustness to properties of the observation noise, and introduces a general robustness-inducing method applicable to policy gradient algorithms.
Findings
Established connections between noise characteristics and policy robustness.
Derived sufficient conditions for robustness in policies.
Proposed a lexicographic optimization scheme that balances robustness and utility.
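To make the idea of lexicographic trade-off concrete, the following is a minimal sketch on a finite set of candidate policies, not the paper's gradient-based scheme. It ranks utility first and uses a robustness loss only to break near-ties. The function name `lexicographic_choice`, the `tolerance` parameter, and the numbers are all hypothetical and chosen purely for illustration.

```python
import numpy as np

def lexicographic_choice(utility, robustness_loss, tolerance):
    """Return the index of the candidate with the lowest robustness loss
    among candidates whose utility is within `tolerance` of the best.

    Illustrative only: a discrete stand-in for lexicographic optimisation,
    not the policy gradient procedure described in the paper.
    """
    utility = np.asarray(utility, dtype=float)
    robustness_loss = np.asarray(robustness_loss, dtype=float)
    # Primary objective: keep only candidates that are near-optimal in utility.
    admissible = np.flatnonzero(utility >= utility.max() - tolerance)
    # Secondary objective: among those, minimise the robustness loss.
    return int(admissible[np.argmin(robustness_loss[admissible])])

# Hypothetical scores for four candidate policies.
utility = [10.0, 9.8, 9.0, 6.0]
robustness_loss = [0.9, 0.3, 0.1, 0.0]

best = lexicographic_choice(utility, robustness_loss, tolerance=0.5)
print(best)
```

With a tolerance of 0.5 only the first two candidates are admissible, so the second one (lower robustness loss) is selected. Setting the tolerance to zero recovers the utility-optimal candidate, which is the sense in which robustness is traded for a bounded amount of utility.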
Abstract
Policy robustness in Reinforcement Learning may not be desirable at any cost: the alterations caused by robustness requirements from otherwise optimal policies should be explainable, quantifiable and formally verifiable. In this work we study how policies can be maximally robust to arbitrary observational noise by analysing how they are altered by this noise through a stochastic linear operator interpretation of the disturbances, and establish connections between robustness and properties of the noise kernel and of the underlying MDPs. Then, we construct sufficient conditions for policy robustness, and propose a robustness-inducing scheme, applicable to any policy gradient algorithm, that formally trades off expected policy utility for robustness through lexicographic optimisation, while preserving convergence and sub-optimality in the policy synthesis.
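The operator view of observational noise can be sketched for a small tabular case. The example below assumes a finite state space, so the noise kernel is a row-stochastic matrix whose entry (s, s') is the probability of observing s' when the true state is s. The names `apply_noise` and `robustness_gap`, the total-variation gap, and the example matrices are hypothetical illustrations rather than the paper's definitions.

```python
import numpy as np

def apply_noise(policy, kernel):
    """Policy actually executed when observations pass through `kernel`.

    policy: (n_states, n_actions) array, each row a distribution over actions.
    kernel: (n_states, n_states) row-stochastic array (assumed noise model).
    """
    policy = np.asarray(policy, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    if not np.allclose(kernel.sum(axis=1), 1.0):
        raise ValueError("kernel rows must sum to 1")
    # Linear operator: mix the action distributions of the observed states.
    return kernel @ policy

def robustness_gap(policy, kernel):
    """Largest per-state total variation distance between the policy and its
    noisy counterpart. An illustrative measure, not the paper's metric."""
    policy = np.asarray(policy, dtype=float)
    diff = np.abs(apply_noise(policy, kernel) - policy)
    return 0.5 * diff.sum(axis=1).max()

# Hypothetical 3-state noise kernel: neighbouring states get confused.
kernel = np.array([[0.8, 0.2, 0.0],
                   [0.1, 0.8, 0.1],
                   [0.0, 0.2, 0.8]])

greedy = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])  # state-dependent
constant = np.array([[0.5, 0.5]] * 3)                     # same in every state

print(robustness_gap(greedy, kernel))
print(robustness_gap(constant, kernel))
```

A policy that picks the same action distribution in every state is a fixed point of any such kernel, so its gap is zero, while the state-dependent policy is altered by the noise. This illustrates why robustness and utility can conflict and need to be traded off.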
Taxonomy
Topics: Reinforcement Learning in Robotics · Fuel Cells and Related Materials · Adversarial Robustness in Machine Learning
