Markov Decision Processes with Multiple Long-run Average Objectives
Tomáš Brázdil (Faculty of Informatics, Masaryk University), Václav Brožek (Faculty of Informatics, Masaryk University), Krishnendu Chatterjee (IST Austria), Vojtěch Forejt (Department of Computer Science, Oxford University)

TL;DR
This paper analyzes Markov decision processes (MDPs) with multiple long-run average objectives. It characterizes how much randomization and memory strategies need, gives polynomial-time algorithms for the decision problems, and shows how to approximate the Pareto curve.
Contribution
It analyzes the strategies needed for MDPs with multiple mean-payoff functions under both expectation and satisfaction objectives, corrects flaws in previous work, and gives new polynomial-time algorithms.
Findings
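Under the expectation objective, both randomization and memory are necessary, even for epsilon-approximation; finite-memory randomized strategies suffice for Pareto optimal values.
Under the satisfaction objective, infinite memory is necessary for some strategies, in contrast to the case of one limit-average function.
The decision problems can be solved in polynomial time, and the Pareto curve can be epsilon-approximated.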
Abstract
We study Markov decision processes (MDPs) with multiple limit-average (or mean-payoff) functions. We consider two different objectives, namely, expectation and satisfaction objectives. Given an MDP with k limit-average functions, in the expectation objective the goal is to maximize the expected limit-average value, and in the satisfaction objective the goal is to maximize the probability of runs such that the limit-average value stays above a given vector. We show that under the expectation objective, in contrast to the case of one limit-average function, both randomization and memory are necessary for strategies even for epsilon-approximation, and that finite-memory randomized strategies are sufficient for achieving Pareto optimal values. Under the satisfaction objective, in contrast to the case of one limit-average function, infinite memory is necessary for strategies achieving a…
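The difference between the two objectives can be seen on a very small example. The sketch below is a hypothetical toy MDP, not one taken from the paper: from the initial state, action "left" leads to an absorbing loop that pays the reward vector (1, 0) forever, and action "right" leads to one that pays (0, 1) forever. The names `LIMIT_AVERAGE`, `expectation`, and `satisfaction` are illustrative assumptions. Because each run ends in one of the two loops, its limit-average vector is fixed by the first choice, so both objectives can be computed exactly.

```python
from fractions import Fraction

# Hypothetical toy MDP (an assumption for illustration, not from the paper).
# Each initial action leads to an absorbing loop, so the limit-average
# vector of a run depends only on the first action taken.
LIMIT_AVERAGE = {
    "left": (Fraction(1), Fraction(0)),   # loop paying (1, 0) forever
    "right": (Fraction(0), Fraction(1)),  # loop paying (0, 1) forever
}


def expectation(p_left):
    """Expected limit-average vector when "left" is chosen with probability p_left."""
    dist = {"left": p_left, "right": 1 - p_left}
    return tuple(
        sum(dist[a] * LIMIT_AVERAGE[a][i] for a in dist) for i in range(2)
    )


def satisfaction(p_left, threshold):
    """Probability of runs whose limit-average vector is >= threshold componentwise."""
    dist = {"left": p_left, "right": 1 - p_left}
    return sum(
        dist[a]
        for a in dist
        if all(v >= t for v, t in zip(LIMIT_AVERAGE[a], threshold))
    )


half = Fraction(1, 2)
print(expectation(half))                # expected vector under a fair coin flip
print(satisfaction(half, (half, half))) # probability of staying above (1/2, 1/2)
```

In this toy model a fair coin flip at the start gives the expected vector (1/2, 1/2), which neither pure choice achieves, yet no run has a limit-average vector above (1/2, 1/2), so the satisfaction probability for that threshold is 0. The example shows only that randomization helps and that the two objectives differ; it does not reproduce the paper's memory lower bounds.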
