Risk-Sensitive Markov Decision Processes with Long-Run CVaR Criterion

Li Xia; Peter W. Glynn

arXiv:2210.08740·math.OC·October 18, 2022


TL;DR

This paper studies infinite-horizon discrete-time Markov decision processes under a long-run CVaR criterion. It derives a CVaR difference formula and a Bellman local optimality equation, and develops a policy iteration algorithm, with applications in portfolio management.

Contribution

The paper introduces a pseudo CVaR metric, uses it to derive a CVaR difference formula, and builds a policy iteration algorithm for long-run CVaR optimization in MDPs, a problem in which the principle of dynamic programming fails.

Findings

01

Derived a CVaR difference formula for policy comparison.

02

Established a Bellman local optimality equation for CVaR.

03

Developed a convergent policy iteration algorithm for CVaR optimization.

Abstract

CVaR (Conditional Value at Risk) is a risk metric widely used in finance. However, dynamically optimizing CVaR is difficult because the problem is not a standard Markov decision process (MDP) and the principle of dynamic programming fails. In this paper, we study the infinite-horizon discrete-time MDP with a long-run CVaR criterion, from the viewpoint of sensitivity-based optimization. By introducing a pseudo CVaR metric, we derive a CVaR difference formula that quantifies the difference in long-run CVaR between any two policies. The optimality of deterministic policies is derived. We obtain a so-called Bellman local optimality equation for CVaR, which is a necessary and sufficient condition for locally optimal policies and only a necessary condition for globally optimal policies. A CVaR derivative formula is also derived to provide more sensitivity information. Then we develop a policy iteration type algorithm to…
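
To make the long-run CVaR criterion concrete, here is a minimal Python sketch. It is not the paper's algorithm. It assumes that, for a fixed policy, long-run CVaR is the CVaR of the per-step cost under the stationary distribution of the induced Markov chain, and it evaluates that quantity with the standard Rockafellar-Uryasev representation CVaR_alpha(X) = min_y { y + E[(X - y)^+] / (1 - alpha) }. The function names, the two-state transition matrix, and the costs are hypothetical and chosen only for illustration.

```python
def stationary_distribution(P, iters=1000):
    """Approximate the stationary distribution of a row-stochastic matrix P
    by power iteration. Assumes the chain is ergodic (assumption)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def ru_objective(costs, probs, y, alpha):
    """Rockafellar-Uryasev objective y + E[(X - y)^+] / (1 - alpha)
    for a discrete cost distribution."""
    excess = sum(p * max(c - y, 0.0) for c, p in zip(costs, probs))
    return y + excess / (1.0 - alpha)


def cvar(costs, probs, alpha):
    """CVaR at level alpha of a discrete cost distribution.
    The objective is convex and piecewise linear in y with kinks at the
    support points, so the minimum is attained at one of the costs."""
    return min(ru_objective(costs, probs, y, alpha) for y in costs)


def long_run_cvar(P, costs, alpha):
    """CVaR of the per-step cost under the stationary distribution of the
    chain induced by a fixed policy (illustrative definition, see lead-in)."""
    pi = stationary_distribution(P)
    return cvar(costs, pi, alpha)


# Hypothetical two-state chain induced by some fixed policy.
P = [[0.9, 0.1],
     [0.5, 0.5]]
costs = [1.0, 10.0]  # per-step cost in each state under that policy

if __name__ == "__main__":
    print(long_run_cvar(P, costs, alpha=0.5))
```

In this toy chain the stationary distribution is (5/6, 1/6), so at level 0.5 the worst half of outcomes mixes the high-cost state with part of the low-cost state. Comparing `long_run_cvar` across candidate policies is a brute-force stand-in for what a difference formula would quantify analytically.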


Taxonomy

Topics: Risk and Portfolio Optimization · Economic theories and models