Distributionally Robust Safety Verification for Markov Decision Processes

Abhijit Mazumdar; Yuting Hou; Rafal Wisniewski

arXiv:2411.15622·eess.SY·November 27, 2024

TL;DR

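This paper proposes a distributionally robust safety verification method for Markov decision processes whose transition kernel is known only up to a Wasserstein ambiguity set around a nominal distribution. It introduces a robust safety function, bounds it from above by a distributionally robust Q-function, and computes that Q-function with a convex program-based Q-iteration algorithm.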

Contribution

It develops a distributionally robust safety verification framework for MDPs with uncertain transition kernels, consisting of a robust safety function, an upper bound on it in terms of a distributionally robust Q-function, and a convex program-based Q-iteration algorithm.

Findings

01

Derived an upper bound on the robust safety function.

02

Proposed a convex program-based robust Q-iteration algorithm.

03

Validated the approach with a numerical example.

Abstract

In this paper, we propose a distributionally robust safety verification method for Markov decision processes where only an ambiguous transition kernel is available instead of the precise transition kernel. We define the ambiguity set around the nominal distribution using a Wasserstein distance. We then introduce a robust safety function to characterize probabilistic safety in the face of uncertain transition probabilities. First, we obtain an upper bound on the robust safety function in terms of a distributionally robust Q-function. Then, we present a convex program-based distributionally robust Q-iteration algorithm to compute the robust Q-function. Finally, we demonstrate our theoretical results on a numerical example.
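
To make the idea in the abstract concrete, the following is a minimal sketch of a Wasserstein-robust Q-iteration on a small finite MDP. It is not the paper's algorithm: the finite state space, the ground metric `d`, the finite horizon, the choice to let the controller minimize over actions, and the function names `worst_case_expectation` and `robust_q_iteration` are all assumptions made here for illustration. For finite supports, the worst-case expectation over a Wasserstein ball can be written as a linear program over transport plans, which is a special case of a convex program.

```python
import numpy as np
from scipy.optimize import linprog


def worst_case_expectation(p, v, d, eps):
    """Maximize E_q[v] over all q within Wasserstein-1 distance eps of p.

    Assumption: finite support, so the problem is an LP over a transport
    plan pi[i, j] (mass moved from nominal state i to state j).
    """
    n = len(p)
    c = -np.tile(v, n)                          # linprog minimizes, so negate v[j]
    A_eq = np.kron(np.eye(n), np.ones((1, n)))  # row sums of pi equal p
    res = linprog(
        c,
        A_ub=d.reshape(1, -1), b_ub=[eps],      # total transport cost <= eps
        A_eq=A_eq, b_eq=p,
        bounds=(0, None),
        method="highs",
    )
    if not res.success:
        raise RuntimeError(res.message)
    return -res.fun


def robust_q_iteration(P, unsafe, d, eps, horizon):
    """Illustrative robust backup (an assumption, not the paper's recursion).

    Q[s, a] approximates the worst-case probability of entering the unsafe
    set within `horizon` steps when starting in s and applying a first.
    P has shape (states, actions, states); unsafe is a boolean mask.
    """
    n_states, n_actions, _ = P.shape
    V = unsafe.astype(float)                    # unsafe states have value 1
    Q = np.zeros((n_states, n_actions))
    for _ in range(horizon):
        for s in range(n_states):
            for a in range(n_actions):
                if unsafe[s]:
                    Q[s, a] = 1.0
                else:
                    Q[s, a] = worst_case_expectation(P[s, a], V, d, eps)
        V = Q.min(axis=1)                       # controller picks the safest action
    return Q


# Hypothetical 3-state chain with one action; state 2 is unsafe.
P = np.array([[[0.9, 0.1, 0.0]],
              [[0.1, 0.8, 0.1]],
              [[0.0, 0.0, 1.0]]])
unsafe = np.array([False, False, True])
d = np.abs(np.subtract.outer(np.arange(3), np.arange(3))).astype(float)

print(robust_q_iteration(P, unsafe, d, eps=0.0, horizon=5))
print(robust_q_iteration(P, unsafe, d, eps=0.1, horizon=5))
```

With `eps=0.0` the ambiguity set collapses to the nominal kernel and the iteration reduces to ordinary Q-iteration; increasing `eps` can only raise the computed values, since the inner maximization ranges over a larger set of distributions.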

Taxonomy

Topics: Risk and Safety Analysis · Software Reliability and Analysis Research · Fault Detection and Control Systems

Methods: Sparse Evolutionary Training