Risk-aware Markov Decision Processes Using Cumulative Prospect Theory
Thomas Brihaye, Krishnendu Chatterjee, Stefanie Mohr, Maximilian Weininger

TL;DR
This paper extends cumulative prospect theory (CPT) from one-shot settings to sequential decision-making models, namely Markov chains (MCs) and Markov decision processes (MDPs). It provides algorithms for computing CPT-values and analyzes strategy complexity.
Contribution
The paper presents an alternative viewpoint on the CPT-value of MCs and MDPs, connects it to multi-objective reachability analysis, and uses that connection to derive the strategy complexity result.
Findings
Memoryless randomized strategies are sufficient for optimality.
The CPT-value computation in MDPs is in EXPTIME and fixed-parameter tractable.
A polynomial-time algorithm is provided for Markov chains.
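To make the Markov chain case concrete, here is a minimal sketch of one ingredient a CPT-value needs: the probability of ending in each possible outcome. It assumes that outcomes are attached to absorbing states and it approximates the distribution by plain forward propagation of probability mass. Both are illustrative choices and not the paper's polynomial-time algorithm; the function name `outcome_distribution` and the example chain are hypothetical.

```python
from collections import defaultdict


def outcome_distribution(transitions, start, tol=1e-12, max_iter=100_000):
    """Approximate the probability of being absorbed in each absorbing state.

    transitions: dict mapping a state to a dict {successor: probability}.
                 A state with no entry (or an empty dict) is treated as
                 absorbing (an assumption of this sketch).
    start:       the initial state.
    """
    mass = {start: 1.0}            # probability mass still moving through the chain
    absorbed = defaultdict(float)  # mass that has settled in absorbing states

    for _ in range(max_iter):
        next_mass = defaultdict(float)
        for state, m in mass.items():
            successors = transitions.get(state)
            if not successors:
                # Absorbing state: the mass stays here for good.
                absorbed[state] += m
                continue
            for succ, p in successors.items():
                next_mass[succ] += m * p
        mass = next_mass
        # Stop once the remaining unsettled mass is negligible.
        if sum(mass.values()) < tol:
            break

    return dict(absorbed)


# Hypothetical chain: from s0, stay with 0.5, win with 0.25, lose with 0.25.
chain = {"s0": {"s0": 0.5, "win": 0.25, "lose": 0.25}}
dist = outcome_distribution(chain, "s0")
```

The resulting distribution over outcomes is the kind of input a CPT functional would then weight and aggregate. An exact method would solve a linear system instead of iterating.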
Abstract
Cumulative prospect theory (CPT) is the first theory for decision-making under uncertainty that combines full theoretical soundness and empirically realistic features [P.P. Wakker - Prospect theory: For risk and ambiguity, Page 2]. While CPT was originally considered in one-shot settings for risk-aware decision-making, we consider CPT in sequential decision-making. The most fundamental and well-studied models for sequential decision-making are Markov chains (MCs) and their generalization, Markov decision processes (MDPs). The complexity-theoretic study of MCs and MDPs with CPT is a fundamental problem that has not been addressed in the literature. Our contributions are as follows: First, we present an alternative viewpoint for the CPT-value of MCs and MDPs. This allows us to establish a connection with multi-objective reachability analysis and conclude the strategy complexity result…
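For readers unfamiliar with the one-shot setting the abstract starts from, the following is a minimal sketch of a CPT-value for a finite prospect. It assumes the standard Tversky-Kahneman (1992) functional forms and their commonly cited parameter estimates (0.88, 2.25, 0.61, 0.69) with reference point 0. The paper may use more general utility and weighting functions; the function names here are hypothetical.

```python
def tk_weight(p, gamma):
    """Tversky-Kahneman (1992) inverse-S probability weighting function."""
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    return p**gamma / (p**gamma + (1.0 - p) ** gamma) ** (1.0 / gamma)


def utility(x, alpha=0.88, beta=0.88, lam=2.25):
    """Power utility around reference point 0; losses are scaled by lam."""
    return x**alpha if x >= 0 else -lam * (-x) ** beta


def cpt_value(prospect, gamma_gain=0.61, gamma_loss=0.69):
    """CPT-value of a one-shot prospect.

    prospect: list of (outcome, probability) pairs whose probabilities sum to 1.
    """
    # Gains are ranked from best to worst, losses from worst to best,
    # so that decision weights come from cumulative probabilities.
    gains = sorted((xp for xp in prospect if xp[0] >= 0), key=lambda xp: -xp[0])
    losses = sorted((xp for xp in prospect if xp[0] < 0), key=lambda xp: xp[0])

    total = 0.0
    for side, gamma in ((gains, gamma_gain), (losses, gamma_loss)):
        cumulative = 0.0
        for x, p in side:
            # Decision weight = difference of weighted cumulative probabilities.
            weight = tk_weight(cumulative + p, gamma) - tk_weight(cumulative, gamma)
            total += weight * utility(x)
            cumulative += p
    return total


# A fair coin flip between winning and losing 100 has expected value 0,
# but is valued negatively here because losses are weighted more heavily.
fair_gamble = [(100.0, 0.5), (-100.0, 0.5)]
value = cpt_value(fair_gamble)
```

The two features visible in this sketch are loss aversion (the factor `lam`) and the overweighting of small probabilities by `tk_weight`. These are the "empirically realistic features" that distinguish CPT from plain expected utility.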
Taxonomy
Topics: Simulation Techniques and Applications · AI-based Problem Solving and Planning · Bayesian Modeling and Causal Inference
