Trading Performance for Stability in Markov Decision Processes
Tomáš Brázdil, Krishnendu Chatterjee, Vojtěch Forejt, Antonín Kučera

TL;DR
This paper explores the complexity of controlling finite-state Markov decision processes to optimize both performance and stability, proposing new stability measures and analyzing their computational properties.
Contribution
It introduces alternative stability definitions (local and hybrid variance), analyzes the complexity of related decision problems, and provides algorithms for approximating the Pareto front.
Findings
The global variance decision problem is in PSPACE and can be approximated in pseudo-polynomial time.
The hybrid variance decision problem is in NP and admits a polynomial-time approximation algorithm.
The local variance decision problem is in NP; special cases are solvable in polynomial time.
Abstract
We study the complexity of central controller synthesis problems for finite-state Markov decision processes, where the objective is to optimize both the expected mean-payoff performance of the system and its stability. We argue that the basic theoretical notion of expressing the stability in terms of the variance of the mean-payoff (called global variance in our paper) is not always sufficient, since it ignores possible instabilities on respective runs. For this reason we propose alternative definitions of stability, which we call local and hybrid variance, and which express how rewards on each run deviate from the run's own mean-payoff and from the expected mean-payoff, respectively. We show that a strategy ensuring both the expected mean-payoff and the variance below given bounds requires randomization and memory, under all the above semantics of variance. We then look at the…
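To make the three notions of variance concrete, here is a minimal Python sketch that estimates them from a finite sample of reward sequences. It is an illustration under stated assumptions, not the paper's formal definitions: each run is truncated to a finite prefix, all runs are weighted equally, and the function names (`mean_payoff`, `global_variance`, `local_variance`, `hybrid_variance`) are hypothetical.

```python
from statistics import fmean

# Assumption: a "run" is a finite list of rewards (a prefix of an infinite run),
# and the sample of runs is treated as equally likely outcomes.

def mean_payoff(run):
    """Average reward along one run."""
    return fmean(run)

def expected_mean_payoff(runs):
    """Mean-payoff averaged over all sampled runs."""
    return fmean([mean_payoff(r) for r in runs])

def global_variance(runs):
    """Variance of the per-run mean-payoff across runs."""
    m = expected_mean_payoff(runs)
    return fmean([(mean_payoff(r) - m) ** 2 for r in runs])

def local_variance(runs):
    """Expected deviation of rewards from each run's own mean-payoff."""
    per_run = []
    for r in runs:
        mp = mean_payoff(r)
        per_run.append(fmean([(x - mp) ** 2 for x in r]))
    return fmean(per_run)

def hybrid_variance(runs):
    """Expected deviation of rewards from the overall expected mean-payoff."""
    m = expected_mean_payoff(runs)
    return fmean([fmean([(x - m) ** 2 for x in r]) for r in runs])

# Two runs with the same mean-payoff: one oscillates, one is constant.
oscillating = [[0, 2, 0, 2], [1, 1, 1, 1]]
# Two constant runs with different mean-payoffs.
split = [[0, 0, 0, 0], [2, 2, 2, 2]]
```

In the `oscillating` sample both runs have mean-payoff 1, so the global variance is zero even though the first run fluctuates at every step; the local and hybrid estimates pick that fluctuation up. In the `split` sample each run is perfectly steady, so the local estimate is zero while the global and hybrid estimates are not.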
Taxonomy
Topics: Formal Methods in Verification · Reinforcement Learning in Robotics · Petri Nets in System Modeling
