Risk-Sensitive Markov Decision Processes with Combined Metrics of Mean and Variance
Li Xia

TL;DR
This paper addresses the optimization of long-term mean-variance performance in Markov decision processes by developing a sensitivity-based approach, proposing a policy iteration algorithm, and demonstrating its application to energy storage systems.
Contribution
It introduces a new sensitivity-based optimization framework for mean-variance MDPs, including a difference formula, optimality conditions, and a convergent policy iteration algorithm.
Findings
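The difference formula quantifies the difference in the combined mean-variance metric between any two policies.
The algorithm converges to a local optimum, and to the global optimum when the mean reward is constant across policies.
An application to reducing wind power fluctuations with energy storage illustrates practical relevance.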
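One way to see why comparing policies through a difference formula can yield strict improvement is the following identity. It is a sketch that assumes the combined metric has the form J = η − βσ², where η is the long-run mean reward and σ² the steady-state variance. The symbols π (stationary distribution), r (reward), y (an arbitrary reference constant), and β are notation introduced for this illustration. The summary above does not state whether the paper's derivation proceeds this way.

```latex
\eta = \sum_i \pi(i)\, r(i), \qquad
\sigma^2 = \sum_i \pi(i)\,\bigl(r(i) - \eta\bigr)^2, \qquad
\sum_i \pi(i)\,\bigl(r(i) - y\bigr)^2 = \sigma^2 + (\eta - y)^2 \;\ge\; \sigma^2 .
```

Scoring a candidate policy against a fixed reference mean y, for instance the current policy's mean, can therefore only overestimate its variance. An improvement measured against that reference is also an improvement in the true combined metric.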
Abstract
This paper investigates the optimization problem of an infinite-stage discrete-time Markov decision process (MDP) with a long-run average metric that considers both the mean and the variance of rewards. Such a performance metric is important since the mean indicates average returns and the variance indicates risk or fairness. However, because the variance metric couples the rewards at all stages, traditional dynamic programming is inapplicable, as the principle of time consistency fails. We study this problem from a new perspective called the sensitivity-based optimization theory. A performance difference formula is derived that quantifies the difference of the mean-variance combined metrics of MDPs under any two different policies. The difference formula can be utilized to generate new policies with strictly improved mean-variance performance. A necessary condition of the optimal policy and…
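To make the combined metric concrete, here is a minimal sketch of evaluating it for one fixed policy. It assumes the metric is J = η − βσ², with η the long-run mean reward and σ² the steady-state variance of the reward, and that the policy induces an ergodic Markov chain. The function names, the weight `beta`, and the two-state chain are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import numpy as np

def stationary_distribution(P):
    """Stationary distribution of an ergodic transition matrix P (rows sum to 1)."""
    n = P.shape[0]
    # Stack the balance equations pi @ P = pi with the normalization sum(pi) = 1
    # and solve the resulting overdetermined linear system by least squares.
    A = np.vstack([P.T - np.eye(n), np.ones((1, n))])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

def mean_variance_metric(P, r, beta):
    """Return (eta, sigma2, J) for the chain P with per-state reward r.

    Assumed form of the combined metric: J = eta - beta * sigma2.
    """
    pi = stationary_distribution(P)
    eta = float(pi @ r)                  # long-run mean reward
    sigma2 = float(pi @ (r - eta) ** 2)  # steady-state variance of the reward
    return eta, sigma2, eta - beta * sigma2

# Hypothetical two-state chain induced by some fixed policy.
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])
r = np.array([1.0, 4.0])

eta, sigma2, J = mean_variance_metric(P, r, beta=0.5)
print(f"mean={eta:.4f}, variance={sigma2:.4f}, combined={J:.4f}")
```

The variance term depends on η, which is itself a function of the whole policy. The reward being optimized therefore changes with the policy, which is the coupling the abstract identifies as the obstacle to standard dynamic programming.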
Taxonomy
Topics: Reinforcement Learning in Robotics · Smart Grid Energy Management · Electric Vehicles and Infrastructure
