Average-Cost Markov Decision Processes with Weakly Continuous Transition Probabilities

Eugene A. Feinberg; Pavlo O. Kasyanov; Nina V. Zadoianchuk

arXiv:1202.4122 · math.OC · February 21, 2012 · 1 citation

Open Access

TL;DR

This paper establishes conditions under which stationary optimal policies exist for average-cost Markov Decision Processes with weakly continuous transition probabilities, even with unbounded costs and noncompact action sets.

Contribution

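The paper gives general sufficient conditions for the existence of stationary discount-optimal and average-cost optimal policies, and describes the properties of value functions and sets of optimal actions, for MDPs with Borel state and action sets, possibly unbounded one-step costs, and possibly noncompact action sets.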

Findings

01

Sufficient conditions for the existence of stationary discount-optimal and average-cost optimal policies

02

Properties of value functions and sets of optimal actions

03

Approximation of average-cost optimal actions by discount-optimal actions

Abstract

This paper presents sufficient conditions for the existence of stationary optimal policies for average-cost Markov Decision Processes with Borel state and action sets and with weakly continuous transition probabilities. The one-step cost functions may be unbounded, and action sets may be noncompact. The main contributions of this paper are: (i) general sufficient conditions for the existence of stationary discount-optimal and average-cost optimal policies and descriptions of properties of value functions and sets of optimal actions, (ii) a sufficient condition for the average-cost optimality of a stationary policy in the form of optimality inequalities, and (iii) approximations of average-cost optimal actions by discount-optimal actions.
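For orientation, the two criteria and the shape of an optimality inequality can be written out as follows. This is only a sketch in notation commonly used for such models, not necessarily the paper's own: the symbols $c$ (one-step cost), $q$ (transition probability), $\alpha$ (discount factor), $\phi$ (stationary policy), $w$ (a constant) and $u$ (a function on the state space $X$) are assumptions introduced here, and the final implication is the standard argument under the extra assumption that $u$ is bounded below.

```latex
% Notation assumed for illustration; the paper's own symbols may differ.
% Expected total discounted cost and average cost per unit time of a policy \pi:
\[
  v_\alpha^\pi(x) = \mathbb{E}_x^\pi \sum_{t=0}^{\infty} \alpha^t c(x_t, a_t),
  \qquad
  w^\pi(x) = \limsup_{N \to \infty} \frac{1}{N}\,
             \mathbb{E}_x^\pi \sum_{t=0}^{N-1} c(x_t, a_t).
\]
% An optimality inequality for a stationary policy \phi,
% with a constant w and a function u on X:
\[
  w + u(x) \;\ge\; c\bigl(x, \phi(x)\bigr)
           + \int_X u(y)\, q\bigl(dy \mid x, \phi(x)\bigr),
  \qquad x \in X.
\]
% Iterating the inequality N times, assuming u is bounded below by a constant L:
\[
  N w + u(x) \;\ge\; \mathbb{E}_x^\phi \sum_{t=0}^{N-1} c(x_t, a_t) + L
  \quad\Longrightarrow\quad
  w^\phi(x) \;\le\; w .
\]
```

If, in addition, $w$ does not exceed the average cost of any policy, the bound $w^\phi(x) \le w$ shows that $\phi$ is average-cost optimal. That is the general sense in which an inequality, rather than an optimality equation, can serve as a sufficient condition.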

Taxonomy

Topics: Economic theories and models · Reinforcement Learning in Robotics · Stochastic processes and financial applications