Controlled Markov Chains with AVaR Criteria for Unbounded Costs
Kerem Ugurlu

TL;DR
This paper develops a method for solving infinite horizon control problems in Markov Decision Processes with unbounded costs using AVaR criteria, establishing the existence of optimal policies and deriving dynamic programming equations.
Contribution
It introduces an approach for handling unbounded costs in MDPs under the AVaR criterion and, to the author's knowledge, gives the first derivation of dynamic programming equations in this setting.
Findings
Existence of optimal policies for unbounded cost MDPs under AVaR.
Derivation of dynamic programming equations with $L^1$-unbounded costs.
Application of state aggregation and heuristic global variable selection.
Abstract
In this paper, we consider the control problem with the Average-Value-at-Risk (AVaR) criteria of the possibly unbounded $L^1$-costs over an infinite horizon on a Markov Decision Process (MDP). With a suitable state aggregation and by choosing a priori a global variable heuristically, we show that there exist optimal policies for the infinite horizon problem. To our knowledge, this is the first work to derive dynamic programming equations with $L^1$-unbounded costs via the AVaR operator.
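The abstract does not define AVaR, so the following is a minimal sketch of the standard Rockafellar-Uryasev representation, AVaR_alpha(X) = min over s of { s + E[(X - s)^+] / (1 - alpha) }, evaluated on a finite sample of costs. It is an illustration of the risk measure only, not the paper's algorithm. The function names and the sample data are hypothetical, and reading the scalar `s` as the "global variable" mentioned in the abstract is an assumption rather than something this summary confirms.

```python
def avar_objective(costs, s, alpha):
    """Rockafellar-Uryasev objective: s + E[(X - s)^+] / (1 - alpha)."""
    mean_excess = sum(max(x - s, 0.0) for x in costs) / len(costs)
    return s + mean_excess / (1.0 - alpha)


def empirical_avar(costs, alpha):
    """Empirical AVaR at level alpha in (0, 1) for a list of sampled costs.

    For an empirical distribution the objective is convex and piecewise
    linear in s with kinks at the sample points, so it is enough to
    evaluate it at each sample value and take the smallest result.
    """
    return min(avar_objective(costs, s, alpha) for s in costs)


def tail_average(costs, alpha):
    """Cross-check: mean of the worst (1 - alpha) fraction of the costs.

    Only matches AVaR exactly when (1 - alpha) * len(costs) is an integer.
    """
    worst_first = sorted(costs, reverse=True)
    k = round((1.0 - alpha) * len(worst_first))
    return sum(worst_first[:k]) / k


if __name__ == "__main__":
    # Hypothetical sample: ten equally likely costs 1, 2, ..., 10.
    sample = [float(i) for i in range(1, 11)]
    print(empirical_avar(sample, 0.8))
    print(tail_average(sample, 0.8))
```

On this sample both functions give approximately 9.5, the average of the two largest costs. The minimization form is the useful one for control problems because it turns the risk measure into an expectation of a cost shifted by a single scalar, which is the kind of structure that allows the state to be augmented and dynamic programming to be applied.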
Taxonomy
Topics: Risk and Portfolio Optimization
