Data-Driven Synthesis of Probabilistic Controlled Invariant Sets for Linear MDPs

Kazumune Hashimoto; Shunki Kimura; Kazunobu Serizawa; Junya Ikemoto; Yulong Gao; Kai Cai

arXiv:2604.02727 · eess.SY · April 6, 2026

TL;DR

This paper presents a data-driven method for computing probabilistic controlled invariant sets in linear MDPs, enabling safety guarantees in reinforcement learning with unknown dynamics.

Contribution

The paper introduces a conservative approximation scheme for probabilistic controlled invariant sets (PCIS) that combines regularized least squares, self-normalized confidence bounds, and lattice discretization, with applications to shielding.

Findings

01 Constructs conservative PCIS with confidence guarantees

02 Provides a tractable approximation via Lipschitz discretization

03 Demonstrates effectiveness in a numerical experiment
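
To illustrate the second finding, the generic argument behind a Lipschitz-based discretization can be written as a short worked inequality. This is a sketch of the standard reasoning only, not the paper's exact bound: the symbols \(f\), \(L\), \(h\), and \(q\) are assumptions introduced here, and the paper's norm and constants may differ.

```latex
% Assumptions (not from the paper): f is a safety-probability lower bound,
% L-Lipschitz in the infinity norm, and the lattice has spacing h.
\[
  |f(x) - f(x')| \le L \,\|x - x'\|_\infty
  \quad\text{and}\quad
  \|x - q\|_\infty \le \tfrac{h}{2}
  \;\;\Longrightarrow\;\;
  f(x) \ge f(q) - \tfrac{L h}{2}.
\]
% Consequence: checking the shifted threshold at the lattice point q
% certifies every state x in the cell centered at q.
\[
  f(q) - \tfrac{L h}{2} \ge 1 - \epsilon
  \;\;\Longrightarrow\;\;
  f(x) \ge 1 - \epsilon \quad \text{for all } x \text{ with } \|x - q\|_\infty \le \tfrac{h}{2}.
\]
```

Under these assumptions, a finer lattice (smaller \(h\)) shrinks the margin \(Lh/2\) and makes the approximation less conservative, at the cost of more lattice points.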

Abstract

We study data-driven computation of probabilistic controlled invariant sets (PCIS) for safety-critical reinforcement learning under unknown dynamics. Assuming a linear MDP model, we use regularized least squares and self-normalized confidence bounds to construct a conservative estimate of the states from which the system can be kept inside a prescribed safe region over an \(N\)-step horizon, together with the corresponding set-valued safe action map. This construction is obtained through a backward recursion and can be interpreted as a conservative approximation of the \(N\)-step safety predecessor operator. When the associated conservative-inclusion event holds, a conservative fixed point of the approximate recursion can be certified as an \((N,\epsilon)\)-PCIS with confidence at least \(\eta\). For continuous state spaces, we introduce a lattice abstraction and a Lipschitz-based…
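
The ingredients named in the abstract (regularized least squares, a confidence width, and a backward recursion) can be sketched on a toy problem. The Python below is a minimal illustration under stated assumptions, not the authors' algorithm: it uses a small tabular MDP with one-hot features (a special case of a linear MDP), a hand-picked constant `beta` standing in for the self-normalized confidence radius, and hypothetical names throughout (`conservative_safe_set`, `one_hot_features`, the toy dynamics). It also assumes `N >= 1`.

```python
import numpy as np

def one_hot_features(n_states, n_actions):
    """Hypothetical feature map: one-hot encoding of the (state, action) pair."""
    def phi(s, a):
        v = np.zeros(n_states * n_actions)
        v[s * n_actions + a] = 1.0
        return v
    return phi

def conservative_safe_set(data, phi, n_states, n_actions, safe, N, eps,
                          lam=1.0, beta=0.5):
    """Pessimistic N-step safety recursion from transition data (s, a, s').

    beta is an assumed constant; in the paper the radius would come from a
    self-normalized confidence bound.
    """
    Phi = np.array([phi(s, a) for s, a, _ in data])
    next_states = np.array([s2 for _, _, s2 in data])
    # Regularized Gram matrix shared by every regression in the recursion.
    Lambda = Phi.T @ Phi + lam * np.eye(Phi.shape[1])
    Lambda_inv = np.linalg.inv(Lambda)

    V = safe.astype(float)  # terminal value: 1 inside the safe region, else 0
    for _ in range(N):
        # Ridge regression of V(s') on phi(s, a).
        w = Lambda_inv @ (Phi.T @ V[next_states])
        lower = np.zeros((n_states, n_actions))
        for s in range(n_states):
            for a in range(n_actions):
                f = phi(s, a)
                width = beta * np.sqrt(f @ Lambda_inv @ f)
                # Lower confidence bound on the safety probability, clipped to [0, 1].
                lower[s, a] = np.clip(f @ w - width, 0.0, 1.0)
        # Unsafe states get value 0; safe states take the best pessimistic action.
        V = safe * lower.max(axis=1)

    states = {s for s in range(n_states) if V[s] >= 1.0 - eps}
    # Set-valued action map: first-step actions whose lower bound meets the threshold.
    actions = {s: [a for a in range(n_actions) if lower[s, a] >= 1.0 - eps]
               for s in states}
    return states, actions

# Toy chain (an assumption for illustration): action 0 stays, action 1 moves
# right; state 3 is unsafe. Each (state, action) pair is observed 100 times.
n_states, n_actions = 4, 2
safe = np.array([True, True, True, False])
data = [(s, a, s if a == 0 else min(s + 1, n_states - 1))
        for s in range(n_states) for a in range(n_actions) for _ in range(100)]
phi = one_hot_features(n_states, n_actions)
states, actions = conservative_safe_set(data, phi, n_states, n_actions,
                                        safe, N=2, eps=0.2)
```

In this toy setting the pessimism compounds over the horizon: each backward step subtracts the confidence width again, so the certified set can only shrink as `N` grows or as the data become scarcer. The returned `actions` map plays the role of a shield, listing the first-step actions that keep the lower bound above `1 - eps`.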
