Tighter Value-Function Approximations for POMDPs

Merlijn Krale, Wietze Koops, Sebastian Junges, Thiago D. Simão, Nils Jansen

arXiv:2502.06523·cs.AI·February 11, 2025

TL;DR

This paper introduces upper value bounds for POMDPs that are provably tighter than the commonly used fast informed bound. Although they cost more to compute, the new bounds accelerate state-of-the-art POMDP solvers on a wide range of benchmarks.

Contribution

The paper proposes new upper value bounds for POMDPs that are provably tighter than the commonly used fast informed bound, and shows that they improve solver performance.

Findings

01 The new upper bounds are tighter than the commonly used fast informed bound.

02 Tighter bounds lead to faster POMDP solving on a wide range of benchmarks.

03 The additional computational overhead is outweighed by the resulting solver speed-up.

Abstract

Solving partially observable Markov decision processes (POMDPs) typically requires reasoning about the values of exponentially many state beliefs. Towards practical performance, state-of-the-art solvers use value bounds to guide this reasoning. However, sound upper value bounds are often computationally expensive to compute, and there is a tradeoff between the tightness of such bounds and their computational cost. This paper introduces new and provably tighter upper value bounds than the commonly used fast informed bound. Our empirical evaluation shows that, despite their additional computational overhead, the new upper bounds accelerate state-of-the-art POMDP solvers on a wide range of benchmarks.
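For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of the classical fast informed bound (FIB) next to the looser QMDP bound. It does not implement the paper's new bounds. The model is the textbook Tiger problem with its usual numbers, which are an assumption chosen for illustration; the function names and the array layout (`T[s][a][s2]`, `O[a][s2][o]`, `R[s][a]`) are likewise hypothetical.

```python
# Sketch only: classical QMDP and fast informed bound (FIB) on the Tiger problem.
# Model numbers and all names below are illustrative assumptions.

def qmdp(T, R, gamma, iters=500):
    """Q(s,a) = R(s,a) + gamma * sum_s2 T(s2|s,a) * max_a2 Q(s2,a2)."""
    nS, nA = len(R), len(R[0])
    Q = [[0.0] * nA for _ in range(nS)]
    for _ in range(iters):
        V = [max(row) for row in Q]
        Q = [[R[s][a] + gamma * sum(T[s][a][s2] * V[s2] for s2 in range(nS))
              for a in range(nA)] for s in range(nS)]
    return Q

def fib(T, O, R, gamma, iters=500):
    """Q(s,a) = R(s,a) + gamma * sum_o max_a2 sum_s2 T(s2|s,a) O(o|s2,a) Q(s2,a2)."""
    nS, nA, nO = len(R), len(R[0]), len(O[0][0])
    Q = [[0.0] * nA for _ in range(nS)]
    for _ in range(iters):
        Q = [[R[s][a] + gamma * sum(
                  max(sum(T[s][a][s2] * O[a][s2][o] * Q[s2][a2]
                          for s2 in range(nS))
                      for a2 in range(nA))
                  for o in range(nO))
              for a in range(nA)] for s in range(nS)]
    return Q

def upper_bound(Q, belief):
    """Value bound at a belief: max over actions of the belief-weighted Q-values."""
    return max(sum(b * Q[s][a] for s, b in enumerate(belief))
               for a in range(len(Q[0])))

# Tiger: states (tiger-left, tiger-right); actions (listen, open-left, open-right);
# observations (hear-left, hear-right).
stay = [[1.0, 0.0], [0.0, 1.0]]      # listening leaves the state unchanged
reset = [[0.5, 0.5], [0.5, 0.5]]     # opening a door resets the tiger uniformly
T = [[stay[s], reset[s], reset[s]] for s in range(2)]   # T[s][a][s2]
O = [[[0.85, 0.15], [0.15, 0.85]],   # listen: correct with probability 0.85
     [[0.5, 0.5], [0.5, 0.5]],       # open-left: uninformative
     [[0.5, 0.5], [0.5, 0.5]]]       # open-right: uninformative
R = [[-1.0, -100.0, 10.0],           # tiger-left
     [-1.0, 10.0, -100.0]]           # tiger-right
gamma = 0.95

belief = [0.5, 0.5]
qmdp_ub = upper_bound(qmdp(T, R, gamma), belief)
fib_ub = upper_bound(fib(T, O, R, gamma), belief)
print(f"QMDP upper bound: {qmdp_ub:.2f}")
print(f"FIB  upper bound: {fib_ub:.2f}")
```

Both updates start from the same all-zero table, and the FIB update never exceeds the QMDP update because it takes the maximum over next actions once per observation rather than once per successor state. The FIB value at any belief is therefore at most the QMDP value, which is the sense in which one sound upper bound is tighter than another.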

Taxonomy

Topics: Mathematical Dynamics and Fractals