Distributionally Safe Reinforcement Learning under Model Uncertainty: A Single-Level Approach by Differentiable Convex Programming
Alaa Eddine Chriat, Chuangchuang Sun

TL;DR
This paper introduces a tractable reinforcement learning framework that enforces safety under distributional shift. It uses duality and differentiable convex programming to reduce the resulting bi-level problem to a single-level one.
Contribution
The paper presents the first single-level, differentiable convex programming approach to distributionally safe reinforcement learning under model uncertainty.
Findings
Significant safety improvements over uncertainty-agnostic policies.
Effective handling of distributional shifts measured by a Wasserstein metric.
Tractable end-to-end differentiable safety enforcement.
Abstract
Safety assurance is uncompromisable in safety-critical environments in the presence of drastic model uncertainties (e.g., distributional shift), especially with humans in the loop. However, incorporating uncertainty in safe learning naturally leads to a bi-level problem, where at the lower level the (worst-case) safety constraint is evaluated within the uncertainty ambiguity set. In this paper, we present a tractable distributionally safe reinforcement learning framework to enforce safety under a distributional shift measured by a Wasserstein metric. To improve the tractability, we first use duality theory to transform the lower-level optimization from the infinite-dimensional probability space, where the distributional shift is measured, to a finite-dimensional parametric space. Moreover, by differentiable convex programming, the bi-level safe learning problem is further reduced to a…
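As an illustration of the duality step the abstract describes, the following minimal sketch covers only the simplest textbook case and is not the paper's formulation (the paper's actual safety constraint is not given in this summary). It assumes a linear safety cost a·ξ, a type-1 Wasserstein ball of radius eps around the empirical distribution, a Euclidean ground metric, and unbounded support. Under those assumptions the worst-case expected cost over the ball has the closed form (empirical mean) + eps·‖a‖₂, so the infinite-dimensional supremum over distributions collapses to a finite-dimensional expression. All names (`worst_case_linear_cost`, `shifted_samples`, the sample values) are hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def empirical_mean_cost(samples, a):
    """Average of the linear cost a . xi over the observed samples."""
    return sum(dot(a, xi) for xi in samples) / len(samples)


def worst_case_linear_cost(samples, a, eps):
    """Closed-form worst-case expectation of a . xi over a type-1
    Wasserstein ball of radius eps (assumed: Euclidean ground metric,
    unbounded support). Equals the empirical mean plus eps * ||a||_2."""
    return empirical_mean_cost(samples, a) + eps * math.sqrt(dot(a, a))


def shifted_samples(samples, a, eps):
    """Move every sample a distance eps along a / ||a||_2. The shifted
    empirical distribution lies in the ball and attains the bound."""
    norm_a = math.sqrt(dot(a, a))
    return [[x + eps * ai / norm_a for x, ai in zip(xi, a)] for xi in samples]


# Hypothetical data: three 2-D disturbance samples and a cost direction.
samples = [[0.0, 1.0], [1.0, 0.5], [-0.5, 2.0]]
a = [3.0, 4.0]
eps = 0.1

nominal = empirical_mean_cost(samples, a)
robust = worst_case_linear_cost(samples, a, eps)
attained = empirical_mean_cost(shifted_samples(samples, a, eps), a)
```

Here `robust` exceeds `nominal` by eps·‖a‖₂, and `attained` matches `robust` up to floating-point error, which confirms that the bound is tight in this special case. A distributionally safe constraint would then require `robust`, rather than `nominal`, to stay below the safety threshold.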
Taxonomy
Topics: Occupational Health and Safety Research · Risk and Safety Analysis · Probabilistic and Robust Engineering Design
