Average-Cost Markov Decision Processes with Weakly Continuous Transition Probabilities
Eugene A. Feinberg, Pavlo O. Kasyanov, Nina V. Zadoianchuk

TL;DR
This paper establishes conditions under which stationary optimal policies exist for average-cost Markov Decision Processes with weakly continuous transition probabilities, even with unbounded costs and noncompact action sets.
Contribution
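It provides general sufficient conditions for the existence of stationary optimal policies, and characterizes value functions and sets of optimal actions, for MDPs with Borel state and action sets, possibly unbounded one-step costs, and possibly noncompact action sets.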
Findings
Sufficient conditions for the existence of stationary discount-optimal and average-cost optimal policies
Properties of value functions and sets of optimal actions
Approximation of average-cost optimal actions by discount-optimal actions
Abstract
This paper presents sufficient conditions for the existence of stationary optimal policies for average-cost Markov Decision Processes with Borel state and action sets and with weakly continuous transition probabilities. The one-step cost functions may be unbounded, and action sets may be noncompact. The main contributions of this paper are: (i) general sufficient conditions for the existence of stationary discount-optimal and average-cost optimal policies and descriptions of properties of value functions and sets of optimal actions, (ii) a sufficient condition for the average-cost optimality of a stationary policy in the form of optimality inequalities, and (iii) approximations of average-cost optimal actions by discount-optimal actions.
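Contribution (iii) can be illustrated on a toy problem. The sketch below is not the paper's construction: it assumes a finite MDP with two states and two actions, with hypothetical transition probabilities and costs, which is far simpler than the Borel setting treated in the paper. The function names `discounted_value_iteration` and `average_cost` are likewise illustrative. It computes discount-optimal actions for discount factors close to 1 and compares the normalized discounted value (1 - alpha) * v_alpha with the long-run average cost of the resulting stationary policy.

```python
# Toy finite MDP with hypothetical numbers (not taken from the paper).
# P[a][x][y]: probability of moving from state x to state y under action a.
P = [
    [[0.9, 0.1], [0.4, 0.6]],  # action 0
    [[0.2, 0.8], [0.7, 0.3]],  # action 1
]
# C[x][a]: one-step cost of choosing action a in state x.
C = [[1.0, 3.0], [4.0, 2.0]]
STATES = range(len(C))
ACTIONS = range(len(P))


def discounted_value_iteration(alpha, tol=1e-10, max_iter=1_000_000):
    """Return (v_alpha, greedy stationary policy) for a discount factor 0 <= alpha < 1."""
    v = [0.0 for _ in STATES]
    for _ in range(max_iter):
        # q[x][a]: cost of taking action a in state x, then continuing with value v.
        q = [[C[x][a] + alpha * sum(P[a][x][y] * v[y] for y in STATES)
              for a in ACTIONS] for x in STATES]
        new_v = [min(q[x]) for x in STATES]
        gap = max(abs(new_v[x] - v[x]) for x in STATES)
        v = new_v
        if gap < tol:
            break
    # Greedy actions with respect to the last computed q-values.
    policy = [q[x].index(min(q[x])) for x in STATES]
    return v, policy


def average_cost(policy, n_steps=10_000):
    """Long-run average cost of a stationary policy, assuming its chain is ergodic."""
    mu = [1.0] + [0.0] * (len(C) - 1)  # start in state 0
    for _ in range(n_steps):
        # Push the state distribution one step forward under the policy.
        mu = [sum(mu[x] * P[policy[x]][x][y] for x in STATES) for y in STATES]
    return sum(mu[x] * C[x][policy[x]] for x in STATES)


if __name__ == "__main__":
    for alpha in (0.9, 0.99, 0.999):
        v, policy = discounted_value_iteration(alpha)
        normalized = [(1 - alpha) * vx for vx in v]
        print(alpha, policy, normalized, average_cost(policy))
```

In this toy example the discount-optimal policy uses action 0 in state 0 and action 1 in state 1, and its long-run average cost is 1.125. As alpha approaches 1, the normalized values (1 - alpha) * v_alpha approach that number. Finiteness and ergodicity are assumptions of the sketch only. The paper's results concern conditions under which an analogous approximation holds with unbounded costs and noncompact action sets.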
Taxonomy
Topics: Economic theories and models · Reinforcement Learning in Robotics · Stochastic processes and financial applications
