Markov Decision Processes with Recursive Risk Measures

Nicole Bäuerle; Alexander Glauner

arXiv:2010.07220·math.OC·October 16, 2025·Eur. J. Oper. Res.

TL;DR

This paper develops a framework for risk-sensitive Markov Decision Processes using recursive risk measures, deriving Bellman equations, proving optimal policy existence, and connecting to distributionally robust MDPs.

Contribution

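It extends the recursive application of static risk measures, previously studied for the entropic risk measure, to an axiomatically characterized class of risk measures for MDPs with Borel state and action spaces and unbounded costs, and it links the resulting objective to distributionally robust MDPs.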

Findings

1. Existence of Markovian optimal policies
2. Bellman equation for recursive risk measures
3. Stationary optimal policies for infinite horizon

Abstract

In this paper, we consider risk-sensitive Markov Decision Processes (MDPs) with Borel state and action spaces and unbounded cost under both finite and infinite planning horizons. Our optimality criterion is based on the recursive application of static risk measures. This is motivated by recursive utilities in the economic literature, has been studied before for the entropic risk measure and is extended here to an axiomatic characterization of suitable risk measures. We derive a Bellman equation and prove the existence of Markovian optimal policies. For an infinite planning horizon, the model is shown to be contractive and the optimal policy to be stationary. Moreover, we establish a connection to distributionally robust MDPs, which provides a global interpretation of the recursively defined objective function. Monotone models are studied in particular.
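
To make the recursive criterion concrete, the following is a minimal sketch of backward induction on a toy finite MDP, using the entropic risk measure as the static risk measure applied at each stage. Everything here is an illustrative assumption rather than the paper's construction: the paper works with Borel spaces and unbounded costs, whereas this sketch uses finite state and action sets, a zero terminal cost, a one-step cost that may depend on the next state, and hypothetical names (`entropic_risk`, `backward_induction`, `P`, `c`, `beta`, `gamma`).

```python
import math


def entropic_risk(values, probs, gamma):
    """Entropic risk (1/gamma) * log E[exp(gamma * X)] of a discrete X (gamma > 0)."""
    # Shift by the largest exponent for numerical stability (log-sum-exp).
    m = max(gamma * v for v in values)
    s = sum(p * math.exp(gamma * v - m) for v, p in zip(values, probs))
    return (m + math.log(s)) / gamma


def backward_induction(P, c, beta, gamma, horizon):
    """Assumed recursion: V_t(s) = min_a rho(c(s, a, S') + beta * V_{t+1}(S')).

    P[s][a][y] is the probability of moving from s to y under action a,
    c[s][a][y] is the associated one-step cost, and rho is the entropic risk.
    Returns the stage-0 value function and a list of Markovian decision rules.
    """
    n_states, n_actions = len(P), len(P[0])
    V = [0.0] * n_states  # assumed terminal cost of zero
    policy = []
    for _ in range(horizon):
        new_V, decision = [], []
        for s in range(n_states):
            q = [
                entropic_risk(
                    [c[s][a][y] + beta * V[y] for y in range(n_states)],
                    P[s][a],
                    gamma,
                )
                for a in range(n_actions)
            ]
            best = min(range(n_actions), key=q.__getitem__)
            decision.append(best)
            new_V.append(q[best])
        V = new_V
        policy.insert(0, decision)  # policy[0] is the first-stage decision rule
    return V, policy


# Toy instance (made-up numbers): 2 states, 2 actions.
P = [[[0.5, 0.5], [1.0, 0.0]], [[0.0, 1.0], [0.5, 0.5]]]
c = [[[0.0, 2.0], [1.2, 1.2]], [[1.0, 1.0], [0.0, 3.0]]]
V, policy = backward_induction(P, c, beta=0.9, gamma=1.0, horizon=1)
```

In this toy instance, action 0 in state 0 has the lower expected cost (1.0 against 1.2) but the higher entropic risk, so the one-stage risk-sensitive rule picks the sure cost of action 1. The decision rule at each stage depends only on the current state, which mirrors the Markovian structure of the optimal policies described in the abstract.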
