Minimizing Expected Termination Time in One-Counter Markov Decision Processes
Tomáš Brázdil, Antonín Kučera, Petr Novotný, Dominik Wojtczak

TL;DR
This paper studies how to approximate the minimal expected termination time in one-counter Markov decision processes. It gives exponential-time approximation algorithms and shows that no polynomial-time approximation algorithm can exist unless P=NP.
Contribution
It introduces methods for approximating the value and computing finite representations of epsilon-optimal strategies in one-counter MDPs, and it establishes an exponential-time upper bound together with a hardness result.
Findings
Exponential-time algorithms for approximating the value up to a given error epsilon > 0
No polynomial-time approximation algorithm exists unless P=NP
A finite representation of an epsilon-optimal strategy can be computed in exponential time
Abstract
We consider the problem of computing the value and an optimal strategy for minimizing the expected termination time in one-counter Markov decision processes. Since the value may be irrational and an optimal strategy may be rather complicated, we concentrate on the problems of approximating the value up to a given error epsilon > 0 and computing a finite representation of an epsilon-optimal strategy. We show that these problems are solvable in exponential time for a given configuration, and we also show that they are computationally hard in the sense that a polynomial-time approximation algorithm cannot exist unless P=NP.
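To make the objective concrete, here is a minimal sketch of what "minimal expected termination time" means on a toy one-counter MDP. It is not the paper's algorithm: it is plain value iteration with the counter clamped at a cap, and the model `TOY_MDP`, the function name, and the parameters `cap`, `tol`, and `max_iters` are all assumptions made for illustration. Termination is assumed to mean the counter reaching zero, and each step costs one time unit.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding: state -> action -> list of
# (probability, next_state, counter_change), with counter_change in {-1, 0, +1}.
Transition = Tuple[float, str, int]
MDP = Dict[str, Dict[str, List[Transition]]]

# Toy model (an assumption, not taken from the paper): one control state,
# two actions with different downward drift.
TOY_MDP: MDP = {
    "s": {
        "fast": [(0.75, "s", -1), (0.25, "s", +1)],
        "slow": [(0.60, "s", -1), (0.40, "s", +1)],
    }
}


def approx_min_termination_time(
    mdp: MDP, cap: int = 60, tol: float = 1e-10, max_iters: int = 100_000
) -> Dict[Tuple[str, int], float]:
    """Naive value iteration on the counter range 0..cap.

    Counter values above `cap` are clamped to `cap`, which is a crude
    truncation chosen for illustration only; it carries no error guarantee.
    """
    states = list(mdp)
    # Counter 0 means "terminated", so those entries stay at 0.
    values = {(s, n): 0.0 for s in states for n in range(cap + 1)}
    for _ in range(max_iters):
        new_values = dict(values)
        delta = 0.0
        for s in states:
            for n in range(1, cap + 1):
                # One step of cost 1, then the best action's expected remainder.
                best = min(
                    1.0 + sum(p * values[(t, min(n + d, cap))] for p, t, d in outcomes)
                    for outcomes in mdp[s].values()
                )
                new_values[(s, n)] = best
                delta = max(delta, abs(best - values[(s, n)]))
        values = new_values
        if delta < tol:
            break
    return values


if __name__ == "__main__":
    vals = approx_min_termination_time(TOY_MDP)
    print(vals[("s", 1)], vals[("s", 3)])
```

In this toy model the "fast" action has expected drift -0.5 per step, so the exact minimal expected termination time from counter value n is 2n, and the sketch should return values close to that for small n. In general the counter is unbounded and the value may be irrational, which is why the paper targets approximation with an explicit error bound rather than a truncation like this one.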
Taxonomy
Topics: Formal Methods in Verification · Machine Learning and Algorithms · Markov Chains and Monte Carlo Methods
