Trace Refinement in Labelled Markov Decision Processes
Nathana\"el Fijalkow, Stefan Kiefer, Mahsa Shirmohammadi

TL;DR
This paper investigates the decidability and complexity of trace refinement in labelled Markov decision processes, providing polynomial-time algorithms for special cases and proving undecidability in general.
Contribution
It introduces new bisimulation techniques for distributions and resolves open questions on the decidability of trace refinement in MDPs.
Findings
Polynomial-time decidability when the second MDP is a Markov chain
Undecidability of the general trace-refinement problem
Decidability results for memoryless strategies
Abstract
Given two labelled Markov decision processes (MDPs), the trace-refinement problem asks whether for all strategies of the first MDP there exists a strategy of the second MDP such that the induced labelled Markov chains are trace-equivalent. We show that this problem is decidable in polynomial time if the second MDP is a Markov chain. The algorithm is based on new results on a particular notion of bisimulation between distributions over the states. However, we show that the general trace-refinement problem is undecidable, even if the first MDP is a Markov chain. Decidability of those problems was stated as open in 2008. We further study the decidability and complexity of the trace-refinement problem provided that the strategies are restricted to be memoryless.
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
