On the Complexity of Robust Markov Decision Processes and Bisimulation Metrics
Marnix Suilen, Guillermo A. Pérez

TL;DR
This paper investigates the computational complexity of robust Markov decision processes with polytopic uncertainty sets, providing complexity classifications, reductions to known problems, and practical algorithms for policy evaluation.
Contribution
It offers new complexity results for RMDPs, including polynomial-time algorithms for certain cases and reductions to parity games and bisimulation metrics.
Findings
Robust policy evaluation for (s,a)-rectangular RMDPs is in P.
Threshold problem for s-rectangular RMDPs is in PSPACE.
Reductions link RMDPs to parity games and bisimulation metrics.
Abstract
Robust Markov decision processes (RMDPs) extend standard Markov decision processes (MDPs) to account for uncertainty in the transition probabilities. RMDPs have an uncertainty set that defines a set of possible transition functions, each of which induces a standard MDP. The natural objective in an RMDP is to optimize the discounted cumulative reward under the worst-case transition function in the uncertainty set. We study the complexity of the associated threshold problem for RMDPs with polytopic uncertainty sets in halfspace representation. Previous results focused on approximating the optimum or restricted attention to specific subclasses of RMDPs, such as interval MDPs or L1-RMDPs. Our contributions are threefold: (1) For (s,a)-rectangular RMDPs, we prove that robust policy evaluation is in P via robust linear programming, and that the threshold problem is in NP. As a…
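To make the worst-case objective concrete, here is a minimal sketch of robust policy evaluation on an interval MDP, one of the subclasses the abstract mentions. It is not the robust linear programming method of the paper: it uses plain robust value iteration, and it assumes a fixed policy, so each state has a single row of transition intervals. The function names, the two-state model, and the threshold value are all hypothetical and chosen only for illustration.

```python
def worst_case_expectation(lower, upper, values):
    """Minimise sum(p[i] * values[i]) subject to lower <= p <= upper, sum(p) == 1.

    For interval uncertainty the minimum is reached greedily: start from the
    lower bounds and push the remaining probability mass onto the successors
    with the smallest values first.
    """
    p = list(lower)
    budget = 1.0 - sum(lower)
    for i in sorted(range(len(values)), key=lambda i: values[i]):
        shift = min(upper[i] - lower[i], budget)
        p[i] += shift
        budget -= shift
    return sum(pi * vi for pi, vi in zip(p, values))


def robust_policy_evaluation(rewards, lower, upper, gamma, tol=1e-10):
    """Iterate the robust Bellman operator of a fixed policy until convergence."""
    n = len(rewards)
    v = [0.0] * n
    while True:
        new_v = [
            rewards[s] + gamma * worst_case_expectation(lower[s], upper[s], v)
            for s in range(n)
        ]
        if max(abs(a - b) for a, b in zip(new_v, v)) < tol:
            return new_v
        v = new_v


# Hypothetical two-state model under a fixed policy (illustration only).
# State 0 earns reward 1 and stays in state 0 with probability in [0.5, 0.9].
# State 1 earns reward 0 and is absorbing.
rewards = [1.0, 0.0]
lower = [[0.5, 0.1], [0.0, 1.0]]
upper = [[0.9, 0.5], [0.0, 1.0]]

values = robust_policy_evaluation(rewards, lower, upper, gamma=0.9)

# Threshold question for this fixed policy: is the worst-case value of
# state 0 at least 1.5?
meets_threshold = values[0] >= 1.5
```

In this toy model the adversary keeps the self-loop probability of state 0 at its lower bound 0.5, so the robust value of state 0 solves v = 1 + 0.9 * 0.5 * v. General polytopic uncertainty sets in halfspace representation need a linear program for the inner minimisation instead of the greedy step used here.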
