Continuous-time Markov decision processes with exponential utility

Yi Zhang

arXiv:1610.02844 · math.OC · November 29, 2016 · SIAM J. Control. Optim.

TL;DR

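This paper studies risk-sensitive continuous-time Markov decision processes with exponential utility. It establishes the optimality equation, shows that a deterministic stationary optimal policy exists under the compactness-continuity condition, and reduces the problem to an equivalent discrete-time model, all without growth restrictions on the transition and cost rates.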

Contribution

It introduces a novel reduction of risk-sensitive CTMDPs to discrete-time models, allowing value iteration without growth constraints on transition and cost rates.

Findings

01 Existence of deterministic stationary optimal policies.

02 Reduction of CTMDP to risk-sensitive discrete-time MDP.

03 Value iteration algorithm derived for the CTMDP.

Abstract

In this paper, we consider a continuous-time Markov decision process (CTMDP) in Borel spaces, where the certainty equivalent with respect to the exponential utility of the total undiscounted cost is to be minimized. The cost rate is nonnegative. We establish the optimality equation. Under the compactness-continuity condition, we show the existence of a deterministic stationary optimal policy. We reduce the risk-sensitive CTMDP problem to an equivalent risk-sensitive discrete-time Markov decision process with the same state and action spaces as the original CTMDP. In particular, the value iteration algorithm for the CTMDP problem follows from this reduction. We do not need to impose any condition on the growth of the transition and cost rates in the state, and the controlled process may be explosive.
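
As a rough illustration of the kind of value iteration that a discrete-time reduction makes available, the sketch below runs the multiplicative Bellman recursion V_{n+1}(x) = min_a exp(c(x,a)) * sum_y p(y|x,a) V_n(y), starting from V_0 = 1, on a small finite risk-sensitive discrete-time MDP. It is not the paper's construction. The finite state space, the risk-sensitivity coefficient fixed at 1, the cost-free absorbing state, and all names (`risk_sensitive_value_iteration`, `P`, `c`, `absorbing`) are assumptions made for illustration; the paper works in Borel spaces and defines its own discrete-time model.

```python
import math


def risk_sensitive_value_iteration(P, c, absorbing, n_iter=200):
    """Iterate V(x) <- min_a exp(c[x][a]) * sum_y P[x][a][y] * V(y).

    Hypothetical toy interface (not from the paper):
      P[x][a]   -- dict mapping next state -> transition probability
      c[x][a]   -- nonnegative one-step cost
      absorbing -- set of cost-free absorbing states, where V is fixed at 1
    Returns a dict of expected exponential total costs per state.
    """
    V = {x: 1.0 for x in P}  # V_0 = 1, i.e. exp(0): no cost accumulated yet
    for _ in range(n_iter):
        new_V = {}
        for x in P:
            if x in absorbing:
                new_V[x] = 1.0
                continue
            new_V[x] = min(
                math.exp(c[x][a]) * sum(p * V[y] for y, p in P[x][a].items())
                for a in P[x]
            )
        V = new_V
    return V


# Toy model: from state 0, "stop" costs 1 and ends the process;
# "retry" costs 0.5 and ends the process only with probability 0.5.
P = {0: {"stop": {1: 1.0}, "retry": {0: 0.5, 1: 0.5}}, 1: {}}
c = {0: {"stop": 1.0, "retry": 0.5}}
V = risk_sensitive_value_iteration(P, c, absorbing={1})
certainty_equivalent = math.log(V[0])
```

In this toy model the iterates increase from 1 and settle on the value of `stop`, so `certainty_equivalent` comes out at 1 up to floating-point error. Taking the logarithm at the end converts the expected exponential cost back into a certainty equivalent under the assumed risk-sensitivity coefficient of 1.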

Taxonomy

Topics: Reinforcement Learning in Robotics · Risk and Portfolio Optimization · Economic theories and models