The Complexity of Graph-Based Reductions for Reachability in Markov Decision Processes

Stephane Le Roux; Guillermo A. Perez

arXiv:1710.07903·cs.LO·February 27, 2018

Open Access

TL;DR

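This paper studies the never-worse relation (NWR) in Markov decision processes with an infinite-horizon reachability objective, shows that the natural decision problem associated with computing it is coNP-complete, and extends a known polynomial-time algorithm that under-approximates it.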

Contribution

It proves that the natural decision problem associated with computing the NWR is coNP-complete, and it extends a previously known incomplete polynomial-time iterative algorithm to under-approximate the relation.

Findings

01

The natural decision problem associated with computing the NWR is coNP-complete.

02

States in the same NWR equivalence class can be collapsed, and actions leading to sub-optimal states can then be removed, which simplifies the model.

03

A previously known incomplete polynomial-time iterative algorithm is extended to under-approximate the NWR.

Abstract

We study the never-worse relation (NWR) for Markov decision processes with an infinite-horizon reachability objective. A state q is never worse than a state p if the maximal probability of reaching the target set of states from p is at most the same value from q, regardless of the probabilities labelling the transitions. Extremal-probability states, end components, and essential states are all special cases of the equivalence relation induced by the NWR. Using the NWR, states in the same equivalence class can be collapsed. Then, actions leading to sub-optimal states can be removed. We show the natural decision problem associated to computing the NWR is coNP-complete. Finally, we extend a previously known incomplete polynomial-time iterative algorithm to under-approximate the NWR.
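
To make the definition in the abstract concrete, the following is a minimal Python sketch, not the algorithm from the paper. It assumes a hypothetical encoding in which an MDP is a dictionary from each state to a list of actions, and each action is the list of its possible successor states (only the graph structure is fixed). The helper names `max_reach`, `random_probs`, and `refutes_never_worse` are illustrative assumptions. The sketch samples random transition probabilities, approximates the maximal reachability values by value iteration, and looks for a labelling under which p does strictly better than q. Finding one shows that q is not never worse than p; finding none proves nothing, since the relation quantifies over all probability labellings.

```python
import random


def max_reach(mdp, probs, target, sweeps=200):
    """Approximate maximal reachability values by value iteration from below."""
    val = {s: (1.0 if s in target else 0.0) for s in mdp}
    for _ in range(sweeps):
        for s, actions in mdp.items():
            if s in target or not actions:
                continue  # targets stay at 1, states without actions stay at 0
            val[s] = max(
                sum(pr * val[t] for t, pr in probs[(s, i)].items())
                for i in range(len(actions))
            )
    return val


def random_probs(mdp, rng):
    """Draw a full-support distribution for every (state, action index) pair."""
    probs = {}
    for s, actions in mdp.items():
        for i, succs in enumerate(actions):
            weights = [rng.uniform(0.05, 1.0) for _ in succs]
            total = sum(weights)
            probs[(s, i)] = {t: w / total for t, w in zip(succs, weights)}
    return probs


def refutes_never_worse(mdp, target, p, q, samples=200, seed=0, eps=1e-6):
    """Return True if a sampled labelling gives p a strictly larger value than q.

    True refutes the claim "q is never worse than p".
    False is inconclusive: sampling cannot prove the relation holds.
    """
    rng = random.Random(seed)
    for _ in range(samples):
        val = max_reach(mdp, random_probs(mdp, rng), target)
        if val[p] > val[q] + eps:
            return True
    return False


# Hypothetical example: "q" reaches the target "fin" surely, while "p" may
# slip into the sink "fail". Successor lists are assumed duplicate-free.
example = {"fin": [], "fail": [], "q": [["fin"]], "p": [["fin", "fail"]]}
claim_q_never_worse_than_p_refuted = refutes_never_worse(example, {"fin"}, "p", "q")
claim_p_never_worse_than_q_refuted = refutes_never_worse(example, {"fin"}, "q", "p")
```

In this example the first check finds no counterexample, which is consistent with q being never worse than p, and the second check finds one, so p is not never worse than q. Because the decision problem is coNP-complete, a sampling check like this can only serve as a quick refutation heuristic.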

Taxonomy

Topics: Bayesian Modeling and Causal Inference · Formal Methods in Verification · Distributed systems and fault tolerance