Beyond discounted returns: Robust Markov decision processes with average and Blackwell optimality
Julien Grand-Clément, Marek Petrik, Nicolas Vieille

TL;DR
This paper extends the theory of Robust Markov Decision Processes (RMDPs) beyond discounted returns to average and Blackwell optimality, establishing new existence results and proposing algorithms for decision-making under parameter uncertainty.
Contribution
It provides foundational results for RMDPs beyond discounted returns, including existence conditions for average and Blackwell optimal policies, and introduces algorithms leveraging stochastic game connections.
Findings
Average optimal policies can be stationary and deterministic for sa-rectangular RMDPs.
Average optimal policies may not exist or may need to be history-dependent for s-rectangular RMDPs.
Epsilon-Blackwell optimal policies always exist for sa-rectangular RMDPs, even though exact Blackwell optimal policies may not.
Abstract
Robust Markov Decision Processes (RMDPs) are a widely used framework for sequential decision-making under parameter uncertainty. RMDPs have been extensively studied when the objective is to maximize the discounted return, but little is known for average optimality (optimizing the long-run average of the rewards obtained over time) and Blackwell optimality (remaining discount optimal for all discount factors sufficiently close to 1). In this paper, we prove several foundational results for RMDPs beyond the discounted return. We show that average optimal policies can be chosen stationary and deterministic for sa-rectangular RMDPs but, perhaps surprisingly, we show that for s-rectangular RMDPs average optimal policies may not exist, and if they exist, may need to be history-dependent (Markovian). We also study Blackwell optimality for sa-rectangular RMDPs, where we show that…
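To make the notions in the abstract concrete, here is a minimal sketch of robust value iteration on a toy sa-rectangular RMDP. It is not the paper's algorithm. The two-state model, the finite uncertainty sets, and the names `rewards`, `kernels`, `robust_q`, and `robust_value_iteration` are all illustrative assumptions. The sketch shows how the discount-optimal policy can change with the discount factor and then stay fixed as the discount factor approaches 1, which is the behaviour Blackwell optimality asks for.

```python
# Toy sa-rectangular RMDP; every number here is an illustrative assumption.
# rewards[s][a] is the immediate reward for action a in state s.
# kernels[s][a] lists the next-state distributions the adversary may
# choose from, independently for each (state, action) pair.
rewards = [[1.0, 2.0],   # state 0: "safe" action pays 1, "risky" pays 2
           [0.0, 0.0]]   # state 1: absorbing, pays nothing
kernels = [
    [[[1.0, 0.0]],                  # safe: stay in state 0 for sure
     [[1.0, 0.0], [0.0, 1.0]]],     # risky: adversary may force state 1
    [[[0.0, 1.0]], [[0.0, 1.0]]],   # state 1 never leaves
]


def robust_q(values, s, a, gamma):
    """Worst-case one-step lookahead value of action a in state s."""
    worst = min(sum(p * v for p, v in zip(dist, values))
                for dist in kernels[s][a])
    return rewards[s][a] + gamma * worst


def robust_value_iteration(gamma, tol=1e-10, max_iter=1_000_000):
    """Iterate the robust Bellman operator; return (values, greedy policy)."""
    n_states = len(rewards)
    values = [0.0] * n_states
    for _ in range(max_iter):
        new_values = [
            max(robust_q(values, s, a, gamma) for a in range(len(rewards[s])))
            for s in range(n_states)
        ]
        gap = max(abs(x - y) for x, y in zip(new_values, values))
        values = new_values
        if gap < tol:
            break
    policy = [
        max(range(len(rewards[s])), key=lambda a: robust_q(values, s, a, gamma))
        for s in range(n_states)
    ]
    return values, policy


for gamma in (0.3, 0.9, 0.99):
    values, policy = robust_value_iteration(gamma)
    # (1 - gamma) * value rescales the discounted return to a per-step average.
    print(gamma, policy, [(1 - gamma) * v for v in values])
```

In this toy model the risky action is discount optimal in state 0 only when the discount factor is below 0.5. For every larger discount factor the safe action is optimal, and the normalized value of state 0 stays at 1, the worst-case long-run average reward of the safe policy.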
Taxonomy
Topics: Supply Chain and Inventory Management · Reinforcement Learning in Robotics · Auction Theory and Applications
