Minimizing Expected Termination Time in One-Counter Markov Decision Processes

Tomáš Brázdil; Antonín Kučera; Petr Novotný; Dominik Wojtczak

arXiv:1205.1473·cs.FL·May 8, 2012

TL;DR

This paper studies how to approximate the minimal expected termination time in one-counter Markov decision processes. It gives exponential-time algorithms for approximating the value and computing an epsilon-optimal strategy, and shows that no polynomial-time approximation algorithm can exist unless P=NP.

Contribution

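It introduces methods for approximating the value up to a given error epsilon > 0 and for computing a finite representation of an epsilon-optimal strategy in one-counter MDPs, and it establishes an exponential-time upper bound together with a matching hardness result.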

Findings

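01 The value can be approximated up to a given error epsilon > 0 in exponential time for a given configuration.

02 A polynomial-time approximation algorithm cannot exist unless P=NP.

03 A finite representation of an epsilon-optimal strategy can be computed in exponential time.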

Abstract

We consider the problem of computing the value and an optimal strategy for minimizing the expected termination time in one-counter Markov decision processes. Since the value may be irrational and an optimal strategy may be rather complicated, we concentrate on the problems of approximating the value up to a given error epsilon > 0 and computing a finite representation of an epsilon-optimal strategy. We show that these problems are solvable in exponential time for a given configuration, and we also show that they are computationally hard in the sense that a polynomial-time approximation algorithm cannot exist unless P=NP.
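To make the quantity in the abstract concrete, the following is a minimal sketch of a naive approximation by value iteration on a truncated counter. It is not the paper's exponential-time algorithm and carries none of its error guarantees. Everything named here is an assumption for illustration: the function `approx_min_termination_time`, the toy model `TOY`, the restriction to controller-owned states, the unit time cost per step, and the choice to clamp the counter at a hypothetical bound `cap`.

```python
from typing import Dict, List, Tuple

# An action is a list of (probability, counter change in {-1, 0, +1}, next state).
Action = List[Tuple[float, int, str]]


def approx_min_termination_time(
    actions: Dict[str, List[Action]], cap: int, sweeps: int
) -> Dict[Tuple[str, int], float]:
    """Naive value iteration for the minimal expected number of steps until
    the counter reaches 0. Assumption: counter values above `cap` are clamped
    to `cap`, which distorts the model near the bound."""
    states = list(actions)
    # Start from 0 everywhere; each sweep applies the Bellman operator once.
    value = {(s, n): 0.0 for s in states for n in range(cap + 1)}
    for _ in range(sweeps):
        new_value = {}
        for s in states:
            new_value[(s, 0)] = 0.0  # counter 0 means the run has terminated
            for n in range(1, cap + 1):
                # One step costs one time unit; the controller picks the
                # action with the smallest expected remaining time.
                new_value[(s, n)] = 1.0 + min(
                    sum(p * value[(t, min(n + d, cap))] for p, d, t in action)
                    for action in actions[s]
                )
        value = new_value
    return value


# Hypothetical one-state model with two actions:
#   action 0: decrement w.p. 0.6, leave the counter unchanged w.p. 0.4
#   action 1: decrement w.p. 0.75, increment w.p. 0.25
TOY = {
    "s": [
        [(0.6, -1, "s"), (0.4, 0, "s")],
        [(0.75, -1, "s"), (0.25, +1, "s")],
    ]
}

values = approx_min_termination_time(TOY, cap=30, sweeps=500)
print(values[("s", 3)])
```

In this toy model the first action is optimal away from the bound and each decrement takes 1/0.6 steps on average, so the value printed for counter 3 should be close to 5. In general the truncation and the finite number of sweeps both introduce error that this sketch does not bound, which is part of what makes the approximation problem in the abstract nontrivial.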

Taxonomy

Topics: Formal Methods in Verification · Machine Learning and Algorithms · Markov Chains and Monte Carlo Methods