Optimizing Expectation with Guarantees in POMDPs (Technical Report)
Krishnendu Chatterjee, Petr Novotný, Guillermo A. Pérez, Jean-François Raskin, Đorđe Žikelić

TL;DR
This paper introduces a new approach for POMDPs that guarantees a minimum payoff for all outcomes while optimizing the expected payoff, addressing safety and performance in uncertain decision-making.
Contribution
It proposes the first practical method for guaranteed payoff optimization in POMDPs, balancing safety constraints with expected reward maximization.
Findings
Method effectively ensures minimum payoff guarantees.
Approach outperforms existing threshold-based methods.
Evaluations on standard benchmarks demonstrate practical viability.
Abstract
A standard objective in partially-observable Markov decision processes (POMDPs) is to find a policy that maximizes the expected discounted-sum payoff. However, such policies may still permit unlikely but highly undesirable outcomes, which is especially problematic in safety-critical applications. Recently, there has been a surge of interest in POMDPs where the goal is to maximize the probability that the payoff is at least a given threshold, but these approaches do not consider any optimization beyond satisfying this threshold constraint. In this work we go beyond both the "expectation" and "threshold" approaches and consider a "guaranteed payoff optimization (GPO)" problem for POMDPs, where we are given a threshold $t$ and the objective is to find a policy $\sigma$ such that a) each possible outcome of $\sigma$ yields a discounted-sum payoff of at least $t$, and b) the…
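The interplay between the worst-case guarantee and the expectation can be illustrated on a toy example. The sketch below is not the paper's method: it assumes a small, hypothetical set of policies whose outcomes are already enumerated as (probability, finite reward sequence) pairs, which sidesteps partial observability and the infinite horizon entirely. All names (`TOY_POLICIES`, `gpo_select`, the threshold parameter `t`, the discount factor `gamma`) are illustrative assumptions. It only shows the selection rule: discard every policy with some outcome below the threshold, then maximize expected discounted-sum payoff among the rest.

```python
from typing import Dict, List, Optional, Tuple

# An outcome is (probability, reward sequence). Hypothetical toy encoding,
# not a POMDP model: outcomes are assumed to be fully enumerated.
Outcome = Tuple[float, List[float]]


def discounted_sum(rewards: List[float], gamma: float) -> float:
    """Discounted-sum payoff: r_0 + gamma * r_1 + gamma^2 * r_2 + ..."""
    return sum(r * gamma**i for i, r in enumerate(rewards))


def worst_case(outcomes: List[Outcome], gamma: float) -> float:
    """Smallest payoff over all outcomes that can actually occur."""
    return min(discounted_sum(rs, gamma) for p, rs in outcomes if p > 0)


def expectation(outcomes: List[Outcome], gamma: float) -> float:
    """Expected discounted-sum payoff."""
    return sum(p * discounted_sum(rs, gamma) for p, rs in outcomes)


def gpo_select(
    policies: Dict[str, List[Outcome]], gamma: float, t: float
) -> Optional[str]:
    """Best expected payoff among policies whose every outcome pays >= t."""
    admissible = {
        name: outs for name, outs in policies.items()
        if worst_case(outs, gamma) >= t
    }
    if not admissible:
        return None  # no policy satisfies the guarantee
    return max(admissible, key=lambda n: expectation(admissible[n], gamma))


# Hypothetical policies, made up for illustration only.
TOY_POLICIES: Dict[str, List[Outcome]] = {
    "risky": [(0.9, [10.0, 10.0]), (0.1, [-20.0, 0.0])],
    "safe": [(1.0, [2.0, 2.0])],
    "balanced": [(0.5, [6.0, 4.0]), (0.5, [1.0, 2.0])],
}

if __name__ == "__main__":
    # "risky" has the highest expectation but one outcome pays -20,
    # so it is rejected for any threshold above -20.
    print(gpo_select(TOY_POLICIES, gamma=0.5, t=2.0))  # balanced
    print(gpo_select(TOY_POLICIES, gamma=0.5, t=3.0))  # safe
```

In this toy setting, raising the threshold shrinks the admissible set, so the selected policy trades expected payoff for a stronger guarantee; a threshold that no policy meets yields `None`.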
Taxonomy
Topics: Formal Methods in Verification · Petri Nets in System Modeling · Real-Time Systems Scheduling
