Distributionally Safe Reinforcement Learning under Model Uncertainty: A Single-Level Approach by Differentiable Convex Programming

Alaa Eddine Chriat; Chuangchuang Sun

arXiv:2310.02459 · cs.LG · October 5, 2023 · 1 citation

Open Access

TL;DR

This paper introduces a novel, tractable reinforcement learning framework that ensures safety under distributional shifts by transforming a bi-level problem into a single-level, differentiable convex optimization problem, improving safety guarantees in uncertain environments.

Contribution

The paper presents the first single-level, differentiable convex programming approach for distributionally safe reinforcement learning under model uncertainty, enhancing safety guarantees.

Findings

01. Significant safety improvements over uncertainty-agnostic policies.

02. Effective handling of distributional shifts using a Wasserstein metric.

03. Tractable, end-to-end differentiable safety enforcement.

Abstract

Safety assurance cannot be compromised in safety-critical environments in the presence of drastic model uncertainties (e.g., distributional shift), especially with humans in the loop. However, incorporating uncertainty in safe learning naturally leads to a bi-level problem, where at the lower level the (worst-case) safety constraint is evaluated within the uncertainty ambiguity set. In this paper, we present a tractable distributionally safe reinforcement learning framework to enforce safety under a distributional shift measured by a Wasserstein metric. To improve tractability, we first use duality theory to transform the lower-level optimization from the infinite-dimensional probability space, where the distributional shift is measured, to a finite-dimensional parametric space. Moreover, by differentiable convex programming, the bi-level safe learning problem is further reduced to a…
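The duality step mentioned in the abstract can be illustrated with a toy case. The sketch below is not the paper's formulation: it assumes a scalar uncertainty on the whole real line, a 1-Wasserstein ball with the absolute-value ground metric centered on an empirical distribution, and a hypothetical linear safety cost a*xi + b. Under those assumptions, the worst-case expectation over all distributions in the ball (an infinite-dimensional problem) collapses to a one-dimensional problem in a single dual multiplier, which has a closed form. The function names, parameters, and threshold are illustrative only.

```python
import numpy as np


def worst_case_linear_cost(samples, a, b, eps):
    """Worst-case E_Q[a*xi + b] over a 1-Wasserstein ball of radius eps
    around the empirical distribution of `samples` (assumed setting:
    scalar xi, unbounded support, absolute-value ground metric)."""
    samples = np.asarray(samples, dtype=float)
    # Dual problem: minimize over lam >= 0
    #   lam * eps + mean_i sup_xi [a*xi + b - lam * |xi - xi_i|].
    # The inner sup is finite only when lam >= |a|, where it equals
    # a*xi_i + b, so the optimal multiplier is lam = |a|.
    lam_star = abs(a)
    return lam_star * eps + float(np.mean(a * samples + b))


def is_distributionally_safe(samples, a, b, eps, threshold=0.0):
    """Check the worst-case safety constraint: sup_Q E_Q[cost] <= threshold."""
    return worst_case_linear_cost(samples, a, b, eps) <= threshold


if __name__ == "__main__":
    xi = [0.0, 1.0, 2.0]
    print(worst_case_linear_cost(xi, a=2.0, b=-5.0, eps=0.5))
    print(is_distributionally_safe(xi, a=2.0, b=-5.0, eps=0.5))
    print(is_distributionally_safe(xi, a=2.0, b=-5.0, eps=2.0))
```

In this toy case the robust constraint is the nominal (empirical) constraint tightened by a margin of |a| * eps, so a larger ambiguity radius makes the same policy harder to certify as safe. For general convex costs the dual has no such closed form and would typically be handled by a convex solver.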

Taxonomy

Topics: Occupational Health and Safety Research · Risk and Safety Analysis · Probabilistic and Robust Engineering Design