Expectation Optimization with Probabilistic Guarantees in POMDPs with Discounted-sum Objectives
Krishnendu Chatterjee, Adrián Elgyütt, Petr Novotný, Owen Rouillé

TL;DR
This paper introduces the expectation optimization with probabilistic guarantee (EOPG) problem for POMDPs with discounted-sum objectives: maximize the expected payoff while ensuring that the payoff stays above a given threshold with at least a specified probability. It also provides the first algorithm for solving this problem.
Contribution
It formulates the EOPG problem in POMDPs with discounted-sum payoffs and presents the first algorithm to solve this risk-aware optimization problem.
Findings
First algorithm for EOPG in POMDPs with discounted-sum objectives.
Addresses risk-averse policy optimization balancing expectation and probabilistic guarantees.
Extends POMDP-based decision making under uncertainty with a probabilistic constraint on the discounted-sum payoff.
Abstract
Partially-observable Markov decision processes (POMDPs) with discounted-sum payoff are a standard framework to model a wide range of problems related to decision making under uncertainty. Traditionally, the goal has been to obtain policies that optimize the expectation of the discounted-sum payoff. A key drawback of the expectation measure is that even low probability events with extreme payoff can significantly affect the expectation, and thus the obtained policies are not necessarily risk-averse. An alternate approach is to optimize the probability that the payoff is above a certain threshold, which allows obtaining risk-averse policies, but ignores optimization of the expectation. We consider the expectation optimization with probabilistic guarantee (EOPG) problem, where the goal is to optimize the expectation ensuring that the payoff is above a given threshold with at least a…
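The tension the abstract describes, between maximizing expectation and guaranteeing a payoff threshold, can be illustrated with a small Monte Carlo sketch. This is not the paper's algorithm: it only compares two hand-written reward processes instead of searching over POMDP policies, it truncates the infinite discounted sum at a finite horizon, and every name in it (`safe_policy`, `risky_policy`, `best_admissible`, the rewards, and the parameters `gamma`, `threshold`, `alpha`) is a hypothetical choice made for illustration.

```python
import random


def discounted_sum(rewards, gamma):
    """Return sum over t of gamma**t * rewards[t] for a finite reward sequence."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


def evaluate_policy(sample_rewards, gamma, threshold, runs=10_000, seed=0):
    """Monte Carlo estimate of (expected payoff, P[payoff >= threshold])."""
    rng = random.Random(seed)
    payoffs = [discounted_sum(sample_rewards(rng), gamma) for _ in range(runs)]
    expectation = sum(payoffs) / runs
    guarantee = sum(p >= threshold for p in payoffs) / runs
    return expectation, guarantee


def safe_policy(rng, horizon=20):
    # Hypothetical process: a steady reward of 1 at every step.
    return [1.0] * horizon


def risky_policy(rng, horizon=20):
    # Hypothetical process: one initial gamble with a rare, very large reward.
    first = 200.0 if rng.random() < 0.1 else -5.0
    return [first] + [1.0] * (horizon - 1)


def best_admissible(policies, gamma, threshold, alpha):
    """Pick the candidate with the highest estimated expectation among those
    whose estimated probability of reaching the threshold is at least alpha."""
    best_name, best_value = None, float("-inf")
    for name, sampler in policies.items():
        expectation, guarantee = evaluate_policy(sampler, gamma, threshold)
        if guarantee >= alpha and expectation > best_value:
            best_name, best_value = name, expectation
    return best_name


if __name__ == "__main__":
    candidates = {"safe": safe_policy, "risky": risky_policy}
    for name, sampler in candidates.items():
        print(name, evaluate_policy(sampler, gamma=0.9, threshold=5.0))
    print(best_admissible(candidates, gamma=0.9, threshold=5.0, alpha=0.9))
```

In this toy setting the risky process has the higher estimated expectation, because the rare large reward dominates the average, yet it reaches the threshold in only a small fraction of runs. Under an EOPG-style constraint with `alpha=0.9` it is rejected and the safe process is selected, whereas pure expectation maximization would choose the risky one.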
