Data-Driven Synthesis of Probabilistic Controlled Invariant Sets for Linear MDPs
Kazumune Hashimoto, Shunki Kimura, Kazunobu Serizawa, Junya Ikemoto, Yulong Gao, Kai Cai

TL;DR
This paper presents a data-driven method for computing probabilistic controlled invariant sets in linear MDPs, enabling safety guarantees in reinforcement learning with unknown dynamics.
Contribution
It introduces a novel conservative approximation scheme for PCIS using regularized least squares, confidence bounds, and lattice discretization, with practical shielding applications.
Findings
Constructs conservative PCIS with confidence guarantees
Provides a tractable approximation via Lipschitz discretization
Demonstrates effectiveness in a numerical experiment
Abstract
We study data-driven computation of probabilistic controlled invariant sets (PCIS) for safety-critical reinforcement learning under unknown dynamics. Assuming a linear MDP model, we use regularized least squares and self-normalized confidence bounds to construct a conservative estimate of the states from which the system can be kept inside a prescribed safe region over an \(N\)-step horizon, together with the corresponding set-valued safe action map. This construction is obtained through a backward recursion and can be interpreted as a conservative approximation of the \(N\)-step safety predecessor operator. When the associated conservative-inclusion event holds, a conservative fixed point of the approximate recursion can be certified as an \((N,\epsilon)\)-PCIS with confidence at least \(\eta\). For continuous state spaces, we introduce a lattice abstraction and a Lipschitz-based…
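The backward recursion described in the abstract can be illustrated on a small finite MDP. The following is a minimal sketch, not the authors' algorithm. It assumes the tabular special case of a linear MDP (one-hot features per state-action pair), in which the regularized least-squares estimate reduces to `sum / (n + lam)` and the confidence width to `beta / sqrt(n + lam)`. The function name `conservative_safe_set`, the parameters `lam` and `beta`, and the constant value of `beta` are hypothetical; the paper derives its width from a self-normalized bound that is not reproduced here. The sketch computes only a conservative N-step safe-set estimate and its first-step safe action map; it does not perform the fixed-point certification or the lattice abstraction.

```python
from math import sqrt


def conservative_safe_set(transitions, states, actions, safe, horizon, eps,
                          lam=1.0, beta=0.1):
    """Conservative estimate of states that stay in `safe` for `horizon` steps.

    transitions: iterable of observed (state, action, next_state) triples.
    lam, beta:   hypothetical ridge parameter and confidence-width scale.
    """
    # Group observed next states by (state, action).
    nexts = {}
    for s, a, s_next in transitions:
        nexts.setdefault((s, a), []).append(s_next)

    # value[s]: lower bound on the probability of remaining in `safe`
    # over the remaining steps (with zero steps left, just membership).
    value = {s: 1.0 if s in safe else 0.0 for s in states}
    q_lower = {}
    for _ in range(horizon):
        q_lower = {}
        for s in states:
            for a in actions:
                seen = nexts.get((s, a), [])
                n = len(seen)
                # Ridge estimate of E[value(s') | s, a] with one-hot features.
                estimate = sum(value[s2] for s2 in seen) / (n + lam)
                # Confidence width; shrinks as more data is observed.
                width = beta / sqrt(n + lam)
                q_lower[(s, a)] = min(1.0, max(0.0, estimate - width))
        # Unsafe states contribute zero; safe states take the best action.
        value = {
            s: max(q_lower[(s, a)] for a in actions) if s in safe else 0.0
            for s in states
        }

    safe_set = {s for s in states if value[s] >= 1.0 - eps}
    # Actions whose lower bound at the first step meets the threshold.
    action_map = {
        s: {a for a in actions if q_lower.get((s, a), 0.0) >= 1.0 - eps}
        for s in safe_set
    }
    return safe_set, action_map
```

Because the estimate is shrunk toward zero and the width is subtracted, state-action pairs with little or no data receive a low bound and are excluded, which is the sense in which the sketch is conservative.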
