On the existence of optimal stationary policies for average Markov decision processes with countable states
Li Xia, Xianping Guo, Xi-Ren Cao

TL;DR
This paper establishes conditions under which optimal stationary policies exist for countable-state Markov decision processes with average costs, even when costs are unbounded, by leveraging metric space compactness and continuity.
Contribution
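It defines a metric on the policy space of ergodic countable-state MDPs and shows that compactness of that space, together with continuity of the long-run average cost with respect to the metric, guarantees the existence of an optimal stationary policy, even when costs are unbounded.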
Findings
Optimal stationary policies exist when the policy space is compact and the long-run average cost is continuous under the defined metric.
The framework applies to systems like queueing models with unbounded costs.
Examples demonstrate the practical applicability of the theoretical results.
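As a rough illustration of the queueing setting mentioned above, the sketch below evaluates the long-run average cost of a few stationary policies in a single-server queue with a linear, unbounded holding cost. It is not taken from the paper: the birth-death model, the truncation at `N` states, the threshold policies, and all names and parameter values (`average_cost`, `threshold_policy`, `lam`, `slow`, `fast`) are assumptions chosen for the example.

```python
def average_cost(lam, rates, hold_cost, rate_cost):
    """Long-run average cost of a birth-death queue under a fixed stationary policy.

    rates[n - 1] is the service rate used in state n = 1..N. Truncating the
    countable state space at N = len(rates) is an assumption of this sketch.
    """
    # Detailed balance: weight(n) = weight(n - 1) * lam / mu_n, with weight(0) = 1.
    weights = [1.0]
    for mu in rates:
        weights.append(weights[-1] * lam / mu)
    total = sum(weights)

    cost = 0.0
    for n, w in enumerate(weights):
        # Holding cost grows linearly in n; service cost is paid only when n > 0.
        service = rate_cost * rates[n - 1] if n > 0 else 0.0
        cost += (w / total) * (hold_cost * n + service)
    return cost


def threshold_policy(threshold, n_states, slow, fast):
    """Hypothetical policy: serve at `fast` once the queue length reaches `threshold`."""
    return [fast if n >= threshold else slow for n in range(1, n_states + 1)]


# Hypothetical parameters: arrival rate 1.0, two service rates, 200 states kept.
lam, slow, fast, N = 1.0, 1.5, 3.0, 200
costs = {
    t: average_cost(lam, threshold_policy(t, N, slow, fast), hold_cost=1.0, rate_cost=0.5)
    for t in range(1, 21)
}
best = min(costs, key=costs.get)
print("best threshold:", best, "average cost:", round(costs[best], 4))
```

Searching over a finite family of thresholds always yields a minimizer, so this only shows how the average cost depends on the stationary policy; the paper's conditions address the harder question of whether a minimizer exists over the full policy space.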
Abstract
For a Markov decision process with countably infinite states, the optimal value may not be achievable in the set of stationary policies. In this paper, we study conditions for the existence of an optimal stationary policy in a countable-state Markov decision process under the long-run average criterion. With a properly defined metric on the policy space of ergodic MDPs, the existence of an optimal stationary policy is guaranteed by the compactness of the space and the continuity of the long-run average cost with respect to the metric. We further extend this condition with assumptions that can be easily verified in control problems for specific systems, such as queueing systems. Our results make a complementary contribution to the literature in the sense that our method is capable of handling cost functions unbounded from both below and above, requiring only continuity and…
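The compactness-and-continuity argument in the abstract can be written as a one-line statement. The notation below is assumed for illustration and may differ from the paper's: $\mathcal{D}$ is the set of stationary policies, $d$ the metric on it, and $\eta^{\pi}$ the long-run average cost of policy $\pi$.

```latex
% Assumed notation: (\mathcal{D}, d) is the metric space of stationary policies,
% \eta^{\pi} is the long-run average cost of \pi.
If $(\mathcal{D}, d)$ is compact and $\pi \mapsto \eta^{\pi}$ is continuous on $\mathcal{D}$, then
\[
  \exists\, \pi^{*} \in \mathcal{D} : \qquad
  \eta^{\pi^{*}} = \inf_{\pi \in \mathcal{D}} \eta^{\pi} .
\]
```

This is the extreme value theorem applied to the policy space: a continuous real-valued function on a compact metric space attains its infimum, so a best stationary policy exists.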
Taxonomy
Topics: Advanced Queuing Theory Analysis · Electric Vehicles and Infrastructure · Reinforcement Learning in Robotics
