Risk-averse optimization of total rewards in Markovian models using deviation measures

Christel Baier; Jakob Piribauer; Maximilian Starke

arXiv:2407.06887·cs.LO·July 10, 2024

TL;DR

This paper studies risk-averse optimization of accumulated rewards in Markov decision processes (MDPs). It analyzes deviation measures such as the semi-variance and the mean absolute deviation (MAD) as alternatives to variance-based penalties, and develops algorithms for the resulting objectives.

Contribution

It introduces and analyzes alternative deviation measures for risk-averse optimization in MDPs, addressing limitations of variance-based approaches and providing properties of optimal policies.

Findings

01. Semi-variance and MAD improve risk-averse behavior.

02. Optimal schedulers for these measures avoid discouraging reward accumulation.

03. Algorithms are developed for MDPs and Markov chains.

Abstract

This paper addresses objectives tailored to the risk-averse optimization of accumulated rewards in Markov decision processes (MDPs). The studied objectives require maximizing the expected value of the accumulated rewards minus a penalty factor times a deviation measure of the resulting distribution of rewards. Using the variance in this penalty mechanism leads to the variance-penalized expectation (VPE) for which it is known that optimal schedulers have to minimize future expected rewards when a high amount of rewards has been accumulated. This behavior is undesirable as risk-averse behavior should keep the probability of particularly low outcomes low, but not discourage the accumulation of additional rewards on already good executions. The paper investigates the semi-variance, which only takes outcomes below the expected value into account, the mean absolute deviation (MAD), and the…
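The objectives described in the abstract have the shape "expected value minus a penalty factor times a deviation measure". The following minimal sketch evaluates that shape on a finite reward distribution for three deviation measures. It is an illustration only, not the paper's algorithm: the dictionary representation, the function names, the two toy distributions, and the penalty factor `0.1` are all assumptions made for this example, and the semi-variance is taken here as the expected squared shortfall below the mean, which may differ from the paper's exact normalization.

```python
# Toy illustration (not the paper's method): a distribution of accumulated
# rewards is a dict mapping reward value -> probability. All names below
# are hypothetical and chosen for this sketch.

def expectation(dist):
    """Expected accumulated reward."""
    return sum(p * x for x, p in dist.items())

def variance(dist):
    """Expected squared deviation from the mean."""
    mu = expectation(dist)
    return sum(p * (x - mu) ** 2 for x, p in dist.items())

def semi_variance(dist):
    """Like the variance, but only outcomes below the mean contribute
    (assumed normalization: plain expectation of the squared shortfall)."""
    mu = expectation(dist)
    return sum(p * (x - mu) ** 2 for x, p in dist.items() if x < mu)

def mean_absolute_deviation(dist):
    """Expected absolute deviation from the mean (MAD)."""
    mu = expectation(dist)
    return sum(p * abs(x - mu) for x, p in dist.items())

def penalized_objective(dist, deviation, penalty):
    """Expected value minus penalty factor times a deviation measure."""
    return expectation(dist) - penalty * deviation(dist)

# Two assumed toy distributions. dist_b agrees with dist_a on the bad
# outcome (reward 0) and sometimes collects extra reward on good runs.
dist_a = {0: 0.5, 10: 0.5}
dist_b = {0: 0.5, 10: 0.25, 30: 0.25}

penalty = 0.1  # assumed penalty factor
for name, deviation in [
    ("variance", variance),
    ("semi-variance", semi_variance),
    ("MAD", mean_absolute_deviation),
]:
    value_a = penalized_objective(dist_a, deviation, penalty)
    value_b = penalized_objective(dist_b, deviation, penalty)
    print(f"{name}: A = {value_a:.2f}, B = {value_b:.2f}")
```

With these toy numbers, the variance penalty ranks distribution A above B even though B only adds reward on already good runs, while the semi-variance and MAD penalties rank B above A. This is a hand-picked example of the effect the abstract describes, not a general guarantee, and the ranking depends on the chosen penalty factor.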

Code & Models

Repositories

experiments-collection/risk-averse-stochastic-shortest-paths (Official)
