Interval POMDP Shielding for Imperfect-Perception Agents
William Scarbro, Ravi Mangal

TL;DR
This paper introduces an interval POMDP-based shielding method for perception-dependent autonomous systems whose perception uncertainty is estimated from finite labeled data, providing high-probability, finite-horizon safety guarantees.
Contribution
It develops a novel algorithm that computes conservative belief sets and uses them to build a runtime shield with finite-horizon safety guarantees under perception uncertainty.
Findings
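The shield improves safety over existing baselines in four case studies.
The finite-horizon safety guarantee holds with high probability over the training data, based on confidence intervals for the probabilities of perception outcomes.
Perception uncertainty is modeled as probability intervals estimated from finite labeled data, within a finite Interval POMDP with discrete states and actions.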
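To make the interval-estimation step concrete, here is a minimal Python sketch of how confidence intervals for perception outcomes might be built from labeled counts. It assumes a Hoeffding bound with a union bound over all (state, observation) entries; the summary does not say which interval construction the paper uses, so this is an illustrative stand-in, and the function name `hoeffding_intervals` and the example labels are hypothetical.

```python
import math

def hoeffding_intervals(counts, delta):
    """Return {state: {obs: (lo, hi)}} probability intervals.

    counts[state][obs] is the number of labeled samples whose true state was
    `state` and whose perception output was `obs`. With probability at least
    1 - delta (over the labeled data), every true probability lies inside its
    interval. ASSUMPTION: Hoeffding + union bound, not necessarily the
    construction used in the paper.
    """
    n_entries = sum(len(row) for row in counts.values())
    intervals = {}
    for state, row in counts.items():
        n = sum(row.values())
        if n == 0:
            raise ValueError(f"no labeled samples for state {state!r}")
        # Hoeffding: P(|p_hat - p| >= eps) <= 2 exp(-2 n eps^2).
        # Setting the right-hand side to delta / n_entries and solving for eps:
        eps = math.sqrt(math.log(2 * n_entries / delta) / (2 * n))
        intervals[state] = {
            obs: (max(0.0, c / n - eps), min(1.0, c / n + eps))
            for obs, c in row.items()
        }
    return intervals

# Hypothetical labeled data for a two-class obstacle detector.
counts = {
    "obstacle": {"obstacle": 90, "clear": 10},
    "clear": {"obstacle": 5, "clear": 95},
}
for state, row in hoeffding_intervals(counts, delta=0.05).items():
    print(state, row)
```

Hoeffding intervals are simple but loose; an exact binomial interval would be tighter for the same confidence level, at the cost of a little more machinery.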
Abstract
Autonomous systems that rely on learned perception can make unsafe decisions when sensor readings are misclassified. We study shielding for this setting: given a proposed action, a shield blocks actions that could violate safety. We consider the common case where system dynamics are known but perception uncertainty must be estimated from finite labeled data. From these data we build confidence intervals for the probabilities of perception outcomes and use them to model the system as a finite Interval Partially Observable Markov Decision Process with discrete states and actions. We then propose an algorithm to compute a conservative set of beliefs over the underlying state that is consistent with the observations seen so far. This enables us to construct a runtime shield that comes with a finite-horizon guarantee: with high probability over the training data, if the true perception…
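The belief-tracking and shielding idea described in the abstract can be sketched in a few lines of Python. This is a simplified illustration, not the paper's algorithm: it over-approximates the belief set with a per-state box of lower and upper bounds, and it checks only a one-step risk rather than a finite horizon. All names (`predict`, `update`, `allowed_actions`), the toy model, and the threshold are hypothetical.

```python
def predict(bounds, T, action):
    """Push per-state belief bounds through the known dynamics.

    bounds[s] = (lo, hi); T[s][action] = {next_state: probability}.
    """
    lo = {s: 0.0 for s in bounds}
    hi = {s: 0.0 for s in bounds}
    for s, (b_lo, b_hi) in bounds.items():
        for t, p in T[s][action].items():
            lo[t] += p * b_lo
            hi[t] += p * b_hi
    return {s: (lo[s], min(1.0, hi[s])) for s in bounds}

def update(bounds, T, action, obs, obs_iv):
    """Interval Bayes update; obs_iv[s][obs] = (lo, hi) for P(obs | s)."""
    pred = predict(bounds, T, action)
    u_lo = {s: obs_iv[s][obs][0] * pred[s][0] for s in pred}
    u_hi = {s: obs_iv[s][obs][1] * pred[s][1] for s in pred}
    post = {}
    for s in pred:
        # u/(u + rest) grows with u and shrinks with rest, so the extremes
        # pair this state's bound with the opposite bound of the others.
        rest_lo = sum(v for t, v in u_lo.items() if t != s)
        rest_hi = sum(v for t, v in u_hi.items() if t != s)
        lo_den, hi_den = u_lo[s] + rest_hi, u_hi[s] + rest_lo
        post[s] = (u_lo[s] / lo_den if lo_den > 0 else 0.0,
                   u_hi[s] / hi_den if hi_den > 0 else 0.0)
    return post

def allowed_actions(bounds, T, actions, unsafe, threshold):
    """Keep actions whose worst-case one-step unsafe probability is small."""
    ok = []
    for a in actions:
        risk_hi = sum(b_hi * sum(p for t, p in T[s][a].items() if t in unsafe)
                      for s, (_, b_hi) in bounds.items())
        if min(1.0, risk_hi) <= threshold:
            ok.append(a)
    return ok

# Hypothetical toy model: a vehicle that is far from or near an obstacle.
T = {
    "far": {"go": {"near": 1.0}, "brake": {"far": 1.0}},
    "near": {"go": {"crash": 1.0}, "brake": {"near": 1.0}},
    "crash": {"go": {"crash": 1.0}, "brake": {"crash": 1.0}},
}
obs_iv = {
    "far": {"see_far": (0.8, 1.0), "see_near": (0.0, 0.2)},
    "near": {"see_far": (0.0, 0.3), "see_near": (0.7, 1.0)},
    "crash": {"see_far": (0.0, 0.3), "see_near": (0.7, 1.0)},
}
prior = {"far": (0.5, 0.5), "near": (0.5, 0.5), "crash": (0.0, 0.0)}
post = update(prior, T, "brake", "see_far", obs_iv)
safe = allowed_actions(post, T, ["go", "brake"], {"crash"}, threshold=0.1)
print(post, safe)
```

The box is deliberately conservative: it contains every belief consistent with some observation model inside the intervals, so an action that passes the check is safe for the next step under all of them, while some acceptable actions may be blocked unnecessarily.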
