Sure-almost-sure and Sure-limit-sure Window Mean Payoff in Markov Decision Processes
Pranshu Gaba, Shibashis Guha

TL;DR
This paper addresses the computational complexity and strategy construction for sure-almost-sure and sure-limit-sure window mean-payoff objectives in Markov decision processes, providing complexity classifications and memory bounds.
Contribution
It solves the sure-almost-sure and sure-limit-sure problems for window mean-payoff objectives, establishing complexity results and strategy memory bounds.
Findings
Both problems are in P for the fixed window length variant, provided the window length is given in unary.
Both problems are in NP ∩ coNP for the bounded window length variant.
The paper provides bounds on the memory required for winning strategies.
Abstract
Given rationals α and β, the sure-almost-sure problem for a quantitative objective φ in a Markov decision process (MDP) asks if one can simultaneously ensure that all outcomes of the MDP have φ-value at least α (i.e. sure satisfaction) and with probability 1 the outcome has φ-value at least β (i.e. almost-sure satisfaction). The sure-limit-sure problem asks if for all ε > 0 one can simultaneously ensure that all outcomes have φ-value at least α and with probability at least 1 − ε the outcome has φ-value at least β. Moreover, if simultaneous satisfaction of objectives is possible, then one would also like to construct a strategy (for sure-almost-sure) or a family of strategies (for sure-limit-sure) that achieves this. In this paper, we solve the sure-almost-sure and…
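The two decision problems described in the abstract can be written compactly as follows. This is only a sketch of the definitions as stated in prose: α and β stand for the two rational thresholds, φ for the quantitative objective, and ε for the tolerance. The names σ (a strategy), Out(σ) (the set of outcomes consistent with σ), and Pr^σ (the probability measure induced by σ) are notation assumed here for illustration and may differ from the paper's.

```latex
% Sure-almost-sure: one strategy meets both requirements.
\exists \sigma :\quad
  \bigl(\forall \pi \in \mathrm{Out}(\sigma) :\ \varphi(\pi) \ge \alpha\bigr)
  \ \wedge\
  \Pr\nolimits^{\sigma}\bigl[\{\pi : \varphi(\pi) \ge \beta\}\bigr] = 1

% Sure-limit-sure: for each tolerance there is a (possibly different) strategy.
\forall \varepsilon > 0\ \ \exists \sigma_{\varepsilon} :\quad
  \bigl(\forall \pi \in \mathrm{Out}(\sigma_{\varepsilon}) :\ \varphi(\pi) \ge \alpha\bigr)
  \ \wedge\
  \Pr\nolimits^{\sigma_{\varepsilon}}\bigl[\{\pi : \varphi(\pi) \ge \beta\}\bigr] \ge 1 - \varepsilon
```

The only difference between the two statements is the order of quantification and the probability bound: sure-almost-sure asks for a single strategy achieving probability exactly 1, while sure-limit-sure allows the strategy to depend on ε, which is why the abstract speaks of a family of strategies in that case.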
