Shields to Guarantee Probabilistic Safety in MDPs

Linus Heck; Filip Macák; Roman Andriushchenko; Milan Češka; Sebastian Junges

arXiv:2605.10888·cs.LO·May 14, 2026

TL;DR

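This paper extends classical shielding to probabilistic safety in MDPs. It gives a formal framework, shows that the strong classical guarantees on safety and permissiveness cannot be preserved, and presents shields with weaker guarantees alongside offline and online constructions that are evaluated empirically.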

Contribution

It introduces a formal framework for probabilistic shielding, shows that the strong guarantees on safety and permissiveness of classical shielding cannot be preserved, and proposes new offline and online shield constructions.

Findings

01

Weakening safety guarantees enables probabilistic shielding.

02

Offline and online shields ensure strong safety guarantees.

03

Empirical evaluation shows the practical advantages of the new shields and their computational feasibility.

Abstract

Shielding is a prominent model-based technique to ensure safety of autonomous agents. Classical shielding aims to ensure that nothing bad ever happens and comes with strong guarantees about safety and maximal permissiveness. However, shielding systems for probabilistic safety, where something bad is allowed to happen with an acceptable probability, has proven to be more intricate. This paper presents a formal framework that conservatively extends classical shields to probabilistic safety. In this framework, we (i) demonstrate the impossibility of preserving the strong guarantees on safety and permissiveness, (ii) provide natural shields with weaker guarantees, and (iii) introduce offline and online shield constructions ensuring strong safety guarantees. The empirical evaluation highlights the practical advantages of the new shields, as well as their computational feasibility.
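To make the notion of probabilistic safety concrete, the following is a minimal sketch of a naive threshold shield on a toy MDP. It is not one of the constructions from the paper. The toy model `MDP`, the set `BAD`, and the helpers `min_risk` and `allowed_actions` are all hypothetical names introduced for illustration. The sketch computes, by value iteration, the minimal probability of ever reaching a bad state, and then allows an action only if the minimal risk after taking it stays within a threshold. Setting the threshold to 0 corresponds to the classical requirement that nothing bad ever happens.

```python
# Toy MDP (hypothetical): state -> action -> list of (probability, successor).
MDP = {
    "s0": {"careful": [(0.9, "safe"), (0.1, "s1")],
           "risky":   [(0.5, "safe"), (0.5, "bad")]},
    "s1": {"a": [(0.8, "safe"), (0.2, "bad")],
           "b": [(1.0, "bad")]},
    "safe": {"stay": [(1.0, "safe")]},
    "bad":  {"stay": [(1.0, "bad")]},
}
BAD = {"bad"}


def min_risk(mdp, bad, sweeps=1000, tol=1e-12):
    """Minimal probability of eventually reaching a bad state, per state.

    Value iteration started from 0 on non-bad states, which approaches
    the minimal reachability probability from below.
    """
    v = {s: (1.0 if s in bad else 0.0) for s in mdp}
    for _ in range(sweeps):
        delta = 0.0
        for s, actions in mdp.items():
            if s in bad:
                continue
            new = min(sum(p * v[t] for p, t in succ)
                      for succ in actions.values())
            delta = max(delta, abs(new - v[s]))
            v[s] = new
        if delta < tol:
            break
    return v


def allowed_actions(mdp, v, state, threshold):
    """Actions whose best-case risk after the step is at most `threshold`.

    Assumption: this local check is only an illustration; the result may
    be empty, and no guarantee from the paper is claimed for it.
    """
    return [a for a, succ in mdp[state].items()
            if sum(p * v[t] for p, t in succ) <= threshold]


RISK = min_risk(MDP, BAD)
print(allowed_actions(MDP, RISK, "s0", 0.25))
print(allowed_actions(MDP, RISK, "s0", 0.0))
```

With a threshold of 0.25 the shield keeps only the `careful` action in `s0`, while a threshold of 0 leaves no action there, since every choice carries some risk. The sketch also hints at why the probabilistic setting is more intricate: the local check assumes the agent acts optimally afterwards, which an arbitrary shielded agent need not do.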
