Constructing MDP Abstractions Using Data with Formal Guarantees
Abolfazl Lavaei, Sadegh Soudjani, Emilio Frazzoli, Majid Zamani

TL;DR
This paper introduces a data-driven method to construct finite Markov decision processes as abstractions of unknown stochastic systems, providing formal guarantees on their closeness with high confidence.
Contribution
It develops a novel approach using stochastic bisimulation functions and scenario convex programs to create data-driven MDPs with formal proximity guarantees.
Findings
Successfully applied to a nonlinear jet engine compressor
Constructed a data-driven MDP with probabilistic safety guarantees
Demonstrated effectiveness in controller synthesis for safety
Abstract
This paper is concerned with a data-driven technique for constructing finite Markov decision processes (MDPs) as finite abstractions of discrete-time stochastic control systems with unknown dynamics, while providing formal closeness guarantees. The proposed scheme is based on notions of stochastic bisimulation functions (SBFs) that capture the probabilistic distance between state trajectories of an unknown stochastic system and those of a finite MDP. In our proposed setting, we first reformulate the corresponding SBF conditions as a robust convex program (RCP). We then propose a scenario convex program (SCP) associated with the original RCP by collecting a finite number of data samples from trajectories of the system. We ultimately construct an SBF between the data-driven finite MDP and the unknown stochastic system with a given confidence level by establishing a probabilistic relation between optimal…
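The step from an RCP to an SCP described in the abstract can be illustrated on a toy problem. The sketch below is not the paper's program: the constraint function `g`, the interval [0, 1], and the sample count are assumptions chosen only for illustration. The robust program minimizes eta subject to g(x) <= eta for every x in [0, 1]; the scenario program enforces the same constraint only at finitely many sampled points, so its optimum can never exceed the robust one.

```python
import random


def g(x):
    # Hypothetical constraint function (an assumption, not from the paper).
    return x * (1.0 - x)


def scenario_optimum(samples):
    # SCP: minimize eta subject to g(x_i) <= eta for each sampled x_i.
    # With a single scalar decision variable, the optimum is simply the
    # largest constraint value observed over the samples.
    return max(g(x) for x in samples)


rng = random.Random(0)  # fixed seed so the run is reproducible
samples = [rng.uniform(0.0, 1.0) for _ in range(1000)]

eta_scp = scenario_optimum(samples)
eta_rcp = 0.25  # robust optimum: max of x(1 - x) on [0, 1], attained at x = 0.5
gap = eta_rcp - eta_scp  # nonnegative, since the SCP relaxes the RCP

print(f"SCP optimum: {eta_scp:.6f}, RCP optimum: {eta_rcp}, gap: {gap:.6f}")
```

Because the scenario program sees only sampled constraints, its solution is optimistic relative to the robust one. Any guarantee carried over to the robust program therefore has to account for this gap, and it holds only with some confidence over the random draw of samples.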
Taxonomy
Topics: Fault Detection and Control Systems · Bayesian Modeling and Causal Inference · Adversarial Robustness in Machine Learning
