Robust Almost-Sure Reachability in Multi-Environment MDPs
Marck van der Vegt, Nils Jansen, Sebastian Junges

TL;DR
This paper studies the problem of synthesizing policies that guarantee almost-sure reachability across multiple environments modeled as MEMDPs, providing complexity results and an algorithm with promising practical performance.
Contribution
It establishes the PSPACE-completeness of robust policy synthesis in MEMDPs and introduces an algorithm for practical policy synthesis.
Findings
PSPACE-complete complexity for MEMDPs
Exponential memory requirements for policies
Promising results on benchmark problems
Abstract
Multiple-environment MDPs (MEMDPs) capture finite sets of MDPs that share the states but differ in the transition dynamics. These models form a proper subclass of partially observable MDPs (POMDPs). We consider the synthesis of policies that robustly satisfy an almost-sure reachability property in MEMDPs, that is, one policy that satisfies a property for all environments. For POMDPs, deciding the existence of robust policies is an EXPTIME-complete problem. In this paper, we show that this problem is PSPACE-complete for MEMDPs, while the policies in general require exponential memory. We exploit the theoretical results to develop and implement an algorithm that shows promising results in synthesizing robust policies for various benchmarks.
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsFormal Methods in Verification
