Risk-Sensitive Partially Observable Markov Decision Processes as Fully Observable Multivariate Utility Optimization problems
Arsham Afsardeir, Andreas Kapetanis, Vaios Laschos, Klaus Obermayer

TL;DR
This paper introduces a novel algorithm for solving risk-sensitive POMDPs with finite state and observation spaces, extending exponential utility methods to sums of exponentials for broader utility functions.
Contribution
The paper presents a new algorithm that generalizes exponential utility approaches to sums of exponentials, enabling risk-sensitive POMDP solutions for a wider class of utility functions.
Findings
The algorithm handles utility functions that are sums of exponentials.
The method extends to any increasing utility function, since such functions can be approximated by sums of exponentials on finite intervals.
Complexity depends on the number of exponential terms.
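As a rough illustration of the approximation step, the following sketch fits a sum of exponentials to an increasing utility function on a finite interval. It is not taken from the paper: the function names, the fixed grid of rates, the target utility sqrt(c + 1), and the use of ordinary least squares are all assumptions made for this example, and the fit is not constrained to be increasing.

```python
import numpy as np

def fit_sum_of_exponentials(utility, rates, lo, hi, n_grid=200):
    """Least-squares weights w with sum_i w[i] * exp(rates[i] * c) ~ utility(c) on [lo, hi].

    The rates are fixed in advance (an assumption of this sketch), so the
    problem is linear in the weights.
    """
    grid = np.linspace(lo, hi, n_grid)
    design = np.exp(np.outer(grid, rates))  # shape (n_grid, n_terms)
    weights, *_ = np.linalg.lstsq(design, utility(grid), rcond=None)
    return weights

def rms_error(utility, weights, rates, lo, hi, n_grid=200):
    """Root-mean-square gap between the fitted sum and the utility on the grid."""
    grid = np.linspace(lo, hi, n_grid)
    approx = np.exp(np.outer(grid, rates)) @ weights
    return float(np.sqrt(np.mean((approx - utility(grid)) ** 2)))

def target(c):
    # Hypothetical increasing utility used only for this illustration.
    return np.sqrt(c + 1.0)

one_rate = np.array([0.0])                          # a single constant term
five_rates = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])  # includes the constant term

for rates in (one_rate, five_rates):
    w = fit_sum_of_exponentials(target, rates, 0.0, 3.0)
    print(len(rates), "terms, RMS error:", rms_error(target, w, rates, 0.0, 3.0))
```

Because the five-term basis contains the one-term basis, the fitting error cannot get worse as terms are added. The price is that each extra term is one more exponential the solver has to track.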
Abstract
We provide a new algorithm for solving Risk-Sensitive Partially Observable Markov Decision Processes when the risk is modeled by a utility function and both the state space and the observation space are finite. The algorithm rests on the observation that the change of measure, and the subsequent introduction of the information space, used for exponential utility functions can be extended to sums of exponentials if one introduces an extra vector parameter that tracks the "expected accumulated cost" corresponding to each exponential. Since every increasing function can be approximated by sums of exponentials on finite intervals, the method can be applied to essentially any utility function, with its complexity depending on the number of exponentials.
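The abstract gives no formulas, so the sketch below does not reproduce the paper's algorithm or its vector parameter. It only illustrates the underlying linearity: for a utility U(c) = sum_i w_i exp(lambda_i c), the expected utility splits into one exponential-utility term per rate, and each term can be propagated by the classic unnormalized information-state recursion. The model arrays `P`, `Q`, `cost`, the fixed open-loop action sequence, and all function names are hypothetical choices for this example.

```python
import itertools
import numpy as np

def update(sigma, action, obs, P, Q, cost, rates):
    """Advance one unnormalized information state per exponential term.

    sigma has shape (n_terms, n_states); row i belongs to rates[i].
    """
    tilt = np.exp(np.outer(rates, cost[:, action]))  # exp(rate_i * cost(x, a))
    predicted = (sigma * tilt) @ P[action]           # push through the transition
    return predicted * Q[:, obs]                     # weight by observation likelihood

def expected_utility(prior, actions, P, Q, cost, weights, rates):
    """Sum the terminal information states over every observation sequence.

    The enumeration is exponential in the horizon; it is here only as a check.
    """
    total = 0.0
    for ys in itertools.product(range(Q.shape[1]), repeat=len(actions)):
        sigma = np.tile(prior, (len(rates), 1))
        for a, y in zip(actions, ys):
            sigma = update(sigma, a, y, P, Q, cost, rates)
        total += weights @ sigma.sum(axis=1)
    return total

def brute_force(prior, actions, P, cost, weights, rates):
    """Reference value: enumerate every state path and apply the utility directly."""
    total = 0.0
    for xs in itertools.product(range(len(prior)), repeat=len(actions) + 1):
        prob, c = prior[xs[0]], 0.0
        for t, a in enumerate(actions):
            prob *= P[a][xs[t], xs[t + 1]]
            c += cost[xs[t], a]
        total += prob * (weights @ np.exp(rates * c))
    return total

# Small random model: 3 states, 2 actions, 2 observations (all made up).
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(3), size=(2, 3))   # P[a][x, x'], rows sum to 1
Q = rng.dirichlet(np.ones(2), size=3)        # Q[x', y], rows sum to 1
cost = rng.uniform(0.0, 1.0, size=(3, 2))    # cost[x, a]
prior = np.array([0.5, 0.3, 0.2])
actions = [0, 1, 0]
weights = np.array([1.0, -0.5])
rates = np.array([0.3, -0.7])

via_filter = expected_utility(prior, actions, P, Q, cost, weights, rates)
via_paths = brute_force(prior, actions, P, cost, weights, rates)
print(via_filter, via_paths)
```

In this sketch the per-step work grows linearly with the number of exponential terms, since `update` carries one row per term. That is in line with the abstract's remark that complexity depends on the number of exponentials.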
Taxonomy
Topics: Bayesian Modeling and Causal Inference
