Discounted continuous-time constrained Markov decision processes in Polish spaces

Xianping Guo; Xinyuan Song

arXiv:1201.0089·math.PR·January 4, 2012


TL;DR

This paper studies constrained continuous-time Markov decision processes in Polish spaces with unbounded transition rates, rewards, and costs, and establishes conditions for the existence of constrained-optimal policies using occupation measures and linear programming.

Contribution

It introduces a new framework for constrained continuous-time MDPs in Polish spaces, including existence proofs and a linear programming approach for optimal policies.

Findings

01

Established conditions for the nonexplosion of the underlying processes and the finiteness of the expected discounted rewards and costs.

02

Proved the existence of constrained optimal policies via occupation measures.

03

Provided a linear programming formulation for solving the constrained MDPs.
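
The occupation-measure idea behind findings 02 and 03 can be illustrated in a much simpler setting than the paper's. The sketch below assumes a hypothetical two-state, two-action model with bounded rates and a fixed stationary randomized policy; all names and numbers (`q`, `phi`, `gamma`, `alpha`, `r`, `c`, `d`) are invented for illustration and are not taken from the paper. It computes the normalized discounted occupation measure of the policy and evaluates the discounted reward and cost as linear functionals of that measure, which is what makes a linear programming formulation possible.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(M[i][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for i in range(col + 1, n):
            f = M[i][col] / M[col][col]
            for j in range(col, n + 1):
                M[i][j] -= f * M[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def occupation_measure(q, phi, gamma, alpha):
    """Normalized discounted occupation measure eta[x][a] of a stationary policy.

    q[x][a][y] : transition rate from x to y under action a (rows sum to 0)
    phi[x][a]  : probability of choosing action a in state x
    gamma[x]   : initial distribution
    alpha      : discount rate (> 0)
    """
    n = len(gamma)
    # Generator of the state process under the policy phi.
    Q = [[sum(phi[x][a] * q[x][a][y] for a in range(len(phi[x])))
          for y in range(n)] for x in range(n)]
    # State marginal mu solves mu (alpha I - Q) = alpha gamma; we solve the
    # transposed system, so A[y][x] = (alpha I - Q)[x][y].
    A = [[(alpha if x == y else 0.0) - Q[x][y] for x in range(n)]
         for y in range(n)]
    mu = solve_linear(A, [alpha * g for g in gamma])
    return [[mu[x] * phi[x][a] for a in range(len(phi[x]))] for x in range(n)]


def discounted_value(eta, f, alpha):
    """Expected discounted integral of f, written as (1/alpha) * sum(f * eta)."""
    return sum(eta[x][a] * f[x][a]
               for x in range(len(eta)) for a in range(len(eta[x]))) / alpha


# Hypothetical model data (illustrative only).
q = [[[-1.0, 1.0], [-3.0, 3.0]],
     [[2.0, -2.0], [0.5, -0.5]]]
phi = [[0.5, 0.5], [1.0, 0.0]]
gamma = [1.0, 0.0]
alpha = 0.5
r = [[1.0, 4.0], [0.0, 2.0]]   # reward rates
c = [[0.0, 1.0], [0.5, 0.0]]   # cost rates
d = 1.5                        # assumed upper bound on the discounted cost

eta = occupation_measure(q, phi, gamma, alpha)
reward_value = discounted_value(eta, r, alpha)
cost_value = discounted_value(eta, c, alpha)
feasible = cost_value <= d
print(eta, reward_value, cost_value, feasible)
```

Because both the objective and the constraint are linear in `eta`, and `eta` itself satisfies a linear balance equation, searching over policies in this toy setting reduces to a linear program over such measures. The paper's setting (Polish spaces, unbounded rates, history-dependent policies) requires considerably more care than this finite sketch suggests.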

Abstract

This paper is devoted to studying constrained continuous-time Markov decision processes (MDPs) in the class of randomized policies depending on state histories. The transition rates may be unbounded, the reward and cost functions may be unbounded from above and from below, and the state and action spaces are Polish spaces. The optimality criterion to be maximized is the expected discounted reward, and constraints can be imposed on the expected discounted costs. First, we give conditions for the nonexplosion of the underlying processes and the finiteness of the expected discounted rewards/costs. Second, using a technique of occupation measures, we prove that the constrained optimality of continuous-time MDPs can be transformed to an equivalent (optimality) problem over a class of probability measures. Based on the equivalent problem and a so-called $\bar{w}$-weak convergence of…
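
For orientation, the constrained problem described in the abstract can be written schematically as below. The notation is an assumption made for this sketch (discount rate $\alpha > 0$, initial distribution $\gamma$, reward rate $r$, cost rates $c_1, \dots, c_N$, constraint constants $d_1, \dots, d_N$, policy class $\Pi$) and may differ from the paper's; the direction of the inequality is likewise assumed.

```latex
% Schematic statement only; symbols are illustrative assumptions.
% For randomized policies, r(x_t, a_t) is shorthand for the reward rate
% averaged over the action distribution chosen at time t.
\begin{aligned}
V_r(\pi) &:= \mathbb{E}^{\pi}_{\gamma}\!\left[\int_0^{\infty} e^{-\alpha t}\, r(x_t, a_t)\, dt\right], \\
V_{c_n}(\pi) &:= \mathbb{E}^{\pi}_{\gamma}\!\left[\int_0^{\infty} e^{-\alpha t}\, c_n(x_t, a_t)\, dt\right], \qquad n = 1, \dots, N, \\
&\text{maximize } V_r(\pi) \text{ over } \pi \in \Pi
\quad \text{subject to} \quad V_{c_n}(\pi) \le d_n, \quad n = 1, \dots, N.
\end{aligned}
```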
