Anytime Guarantees for Reachability in Uncountable Markov Decision Processes

Kush Grover; Jan Křetínský; Tobias Meggendorfer; Maximilian Weininger

arXiv:2008.04824 · eess.SY · July 13, 2022

TL;DR

This paper develops an anytime algorithm for approximating reachability probabilities in MDPs with uncountable (continuous) state and action spaces. It gives guarantees on the precision of the approximation under assumptions chosen to be as weak as possible.

Contribution

The paper presents two solution variants that provide converging bounds on reachability probabilities, and it identifies sufficient assumptions, as weak as possible, under which continuous-state MDPs can be analyzed with such guarantees.

Findings

01 Provides converging lower bounds under the weaker set of assumptions

02 Provides converging upper bounds under stronger assumptions

03 Enables iterative (anytime) approximation with a known bound on the precision
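
To illustrate what converging lower and upper bounds with known precision mean, the sketch below runs interval-style value iteration on a tiny finite MDP. This is not the paper's algorithm, which targets uncountable state and action spaces. The toy model, the function name `interval_iteration`, and the state numbering are assumptions made purely for illustration. The toy model has no end components other than the absorbing target and sink, which is what lets the upper bound converge here.

```python
from typing import Dict, List, Tuple

# Hypothetical toy model: transitions[state][action] = [(probability, successor), ...]
MDP = Dict[int, Dict[str, List[Tuple[float, int]]]]

TOY_MDP: MDP = {
    0: {"a": [(0.5, 1), (0.5, 3)], "b": [(0.3, 2), (0.7, 3)]},
    1: {"a": [(0.8, 2), (0.2, 0)]},
}
TARGET, SINK = 2, 3  # both absorbing; they have no entries in TOY_MDP


def interval_iteration(mdp: MDP, target: int, sink: int, eps: float):
    """Return (lower, upper, gaps) for the maximal probability of reaching target."""
    states = set(mdp) | {target, sink}
    # Start from trivially sound bounds: 0 <= value <= 1.
    lower = {s: 0.0 for s in states}
    upper = {s: 1.0 for s in states}
    lower[target] = 1.0  # the target is reached with probability 1 from itself
    upper[sink] = 0.0    # the sink never reaches the target
    gaps: List[float] = []
    while True:
        for bound in (lower, upper):
            # Synchronous Bellman update: best action w.r.t. the current bound.
            updated = {
                s: max(sum(p * bound[t] for p, t in succ) for succ in mdp[s].values())
                for s in mdp
            }
            bound.update(updated)
        # The gap is a precision guarantee available after every iteration.
        gap = max(upper[s] - lower[s] for s in states)
        gaps.append(gap)
        if gap <= eps:
            return lower, upper, gaps


if __name__ == "__main__":
    lo, up, gaps = interval_iteration(TOY_MDP, TARGET, SINK, eps=1e-6)
    for s in sorted(TOY_MDP):
        print(f"state {s}: value in [{lo[s]:.6f}, {up[s]:.6f}]")
```

Stopping the loop early still yields a sound interval around the true value, just a wider one. That is the anytime property in miniature.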

Abstract

We consider the problem of approximating the reachability probabilities in Markov decision processes (MDP) with uncountable (continuous) state and action spaces. While there are algorithms that, for special classes of such MDP, provide a sequence of approximations converging to the true value in the limit, our aim is to obtain an algorithm with guarantees on the precision of the approximation. As this problem is undecidable in general, assumptions on the MDP are necessary. Our main contribution is to identify sufficient assumptions that are as weak as possible, thus approaching the "boundary" of which systems can be correctly and reliably analyzed. To this end, we also argue why each of our assumptions is necessary for algorithms based on processing finitely many observations. We present two solution variants. The first one provides converging lower bounds under weaker assumptions…
