Data-Driven Robust Safety Verification for Markov Decision Processes
Abhijit Mazumdar, Manuela L. Bujorianu, and Rafal Wisniewski

TL;DR
This paper introduces a data-driven framework for verifying safety in stochastic systems modeled as Markov decision processes, accounting for uncertainties in transition probabilities using samples and Wasserstein ambiguity sets.
Contribution
It develops a novel safety verification method that handles uncertain, time-varying transition kernels with finite data, providing high-confidence safety guarantees.
Findings
Effective safety guarantees under transition uncertainty
Unified ambiguity set capturing variability and statistical uncertainty
Numerical example demonstrating practical applicability
Abstract
In this paper, we propose a data-driven robust safety verification framework for stochastic dynamical systems modeled as Markov decision processes with time-varying and uncertain transition probabilities. Rather than assuming access to the exact nominal transition kernel, we consider the realistic setting where only samples from multiple system executions are available. These samples may correspond to different transition models inside an ambiguity set around the nominal transition kernel. Using these observations, we construct a unified ambiguity set that captures both inherent run-to-run variability in the transition dynamics and finite-sample statistical uncertainty. This ambiguity set is formalized through a Wasserstein-distance ball around a nominal empirical distribution and naturally induces an interval Markov decision process representation of the underlying system. Within this…
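The step from an ambiguity set to an interval Markov decision process can be illustrated with a small sketch. It is not the authors' algorithm, and everything in it is an assumption made for illustration: the function names `worst_case_expectation` and `robust_safety`, the parameters `p_hat`, `eps`, `safe`, and `horizon`, and the choice of a discrete (0-1) ground metric, under which a Wasserstein ball of radius `eps` around the empirical distribution is contained in the entrywise intervals `[p_hat - eps, p_hat + eps]`. The sketch then runs a finite-horizon robust recursion in which the controller picks the best action and the transition probabilities are chosen adversarially within the intervals at every step.

```python
import numpy as np

def worst_case_expectation(lo, hi, v):
    """Minimise p @ v subject to lo <= p <= hi and sum(p) == 1.

    Greedy solution: start from the lower bounds, then hand the remaining
    probability mass to successors in increasing order of value.
    """
    p = lo.copy()
    budget = 1.0 - p.sum()
    for i in np.argsort(v):
        add = min(hi[i] - p[i], budget)
        p[i] += add
        budget -= add
    return float(p @ v)

def robust_safety(p_hat, eps, safe, horizon):
    """Worst-case probability of staying in the safe set for `horizon` steps.

    p_hat : array (S, A, S), empirical transition probabilities (assumed input)
    eps   : interval half-width (assumption: stands in for the ball radius)
    safe  : boolean array (S,), True for safe states
    """
    lo = np.clip(p_hat - eps, 0.0, 1.0)
    hi = np.clip(p_hat + eps, 0.0, 1.0)
    n_states, n_actions, _ = p_hat.shape
    v = safe.astype(float)
    for _ in range(horizon):
        q = np.array([[worst_case_expectation(lo[s, a], hi[s, a], v)
                       for a in range(n_actions)]
                      for s in range(n_states)])
        # Best action for the controller; unsafe states contribute zero.
        v = safe * q.max(axis=1)
    return v

# Hypothetical two-state example: state 0 is safe, state 1 is an unsafe sink.
p_hat = np.array([[[0.9, 0.1]],
                  [[0.0, 1.0]]])
safe = np.array([True, False])
v = robust_safety(p_hat, eps=0.05, safe=safe, horizon=2)
```

The interval relaxation is generally looser than the Wasserstein ball itself, so the resulting values are conservative lower bounds under the stated assumptions. Because the adversary re-selects the transition probabilities at every step, the recursion also covers kernels that vary over time within the intervals.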
Taxonomy
Topics: Formal Methods in Verification · Adversarial Robustness in Machine Learning · Risk and Portfolio Optimization
