The Complexity of Graph-Based Reductions for Reachability in Markov Decision Processes
Stephane Le Roux, Guillermo A. Perez

TL;DR
This paper studies the never-worse relation (NWR) for Markov decision processes with reachability objectives, shows that the associated decision problem is coNP-complete, and extends a known polynomial-time algorithm that under-approximates the relation.
Contribution
The paper shows that the natural decision problem associated with computing the NWR is coNP-complete, and it extends an existing incomplete polynomial-time iterative algorithm to under-approximate the relation.
Findings
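The decision problem associated with computing the NWR is coNP-complete.
States in the same NWR equivalence class can be collapsed, and actions leading to sub-optimal states can then be removed, which simplifies the model.
An extension of a previously known incomplete polynomial-time iterative algorithm under-approximates the NWR.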
Abstract
We study the never-worse relation (NWR) for Markov decision processes with an infinite-horizon reachability objective. A state q is never worse than a state p if the maximal probability of reaching the target set of states from p is at most the same value from q, regardless of the probabilities labelling the transitions. Extremal-probability states, end components, and essential states are all special cases of the equivalence relation induced by the NWR. Using the NWR, states in the same equivalence class can be collapsed. Then, actions leading to sub-optimal states can be removed. We show the natural decision problem associated to computing the NWR is coNP-complete. Finally, we extend a previously known incomplete polynomial-time iterative algorithm to under-approximate the NWR.
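To make the definition in the abstract concrete, the following is a minimal Python sketch, not the paper's algorithm. It assumes a hypothetical four-state MDP (`toy_mdp`, with states `q`, `p`, `fin`, `fail` and parameters `x`, `y` standing in for the transition probabilities), approximates maximal reachability probabilities by plain value iteration, and searches a small grid of probability labellings for one on which the supposedly worse state does strictly better. Such a search can only refute a never-worse claim; finding no counterexample on a finite grid proves nothing.

```python
from itertools import product


def max_reach(mdp, targets, iters=100):
    """Approximate maximal reachability probabilities by value iteration from below."""
    value = {s: (1.0 if s in targets else 0.0) for s in mdp}
    for _ in range(iters):
        # Each sweep reads the previous `value` and builds a fresh dictionary.
        value = {
            s: 1.0 if s in targets else max(
                sum(pr * value[t] for t, pr in dist.items())
                for dist in mdp[s].values()
            )
            for s in mdp
        }
    return value


def toy_mdp(x, y):
    """Hypothetical example: q moves to fin with probability x, otherwise to p;
    p moves to fin with probability y, otherwise to the sink fail."""
    return {
        "q": {"a": {"fin": x, "p": 1.0 - x}},
        "p": {"b": {"fin": y, "fail": 1.0 - y}},
        "fin": {"stay": {"fin": 1.0}},
        "fail": {"stay": {"fail": 1.0}},
    }


def find_counterexample(better, worse, grid):
    """Return a labelling (x, y) on which `worse` strictly beats `better`, else None."""
    for x, y in product(grid, repeat=2):
        v = max_reach(toy_mdp(x, y), {"fin"})
        if v[worse] > v[better] + 1e-9:
            return (x, y)
    return None


grid = [0.1, 0.25, 0.5, 0.75, 0.9]
```

In this toy model the value of `q` is x + (1 - x)·y, which is at least the value y of `p`, so `find_counterexample("q", "p", grid)` returns `None`, consistent with q being never worse than p. The reverse call, `find_counterexample("p", "q", grid)`, returns a labelling, which shows that p is not never worse than q.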
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Formal Methods in Verification · Distributed systems and fault tolerance
