Tighter Value-Function Approximations for POMDPs
Merlijn Krale, Wietze Koops, Sebastian Junges, Thiago D. Simão, Nils Jansen

TL;DR
This paper introduces new tighter upper value bounds for POMDPs that improve solver performance despite higher computational costs, advancing practical decision-making under uncertainty.
Contribution
The paper proposes a novel class of upper value bounds for POMDPs that are provably tighter than existing bounds, enhancing solver efficiency.
Findings
The new bounds are provably tighter than the commonly used fast informed bound.
Tighter bounds lead to faster POMDP solving on benchmark problems.
Additional computational overhead is justified by performance gains.
Abstract
Solving partially observable Markov decision processes (POMDPs) typically requires reasoning about the values of exponentially many state beliefs. Towards practical performance, state-of-the-art solvers use value bounds to guide this reasoning. However, sound upper value bounds are often computationally expensive to compute, and there is a tradeoff between the tightness of such bounds and their computational cost. This paper introduces new and provably tighter upper value bounds than the commonly used fast informed bound. Our empirical evaluation shows that, despite their additional computational overhead, the new upper bounds accelerate state-of-the-art POMDP solvers on a wide range of benchmarks.
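For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of the standard fast informed bound (FIB) next to the looser QMDP bound. It does not implement the paper's new bounds. The tiny POMDP (two states, two actions, two observations) and all of its numbers are hypothetical, as are the names `T`, `Z`, `R`, `qmdp`, `fib`, and `upper_bound`.

```python
# Illustrative POMDP; every number here is an assumption, not from the paper.
GAMMA = 0.9
S, A, O = range(2), range(2), range(2)
# T[a][s][s2] = P(s2 | s, a)
T = [[[0.9, 0.1], [0.1, 0.9]],
     [[0.5, 0.5], [0.5, 0.5]]]
# Z[a][s2][o] = P(o | s2, a)
Z = [[[0.8, 0.2], [0.2, 0.8]],
     [[0.5, 0.5], [0.5, 0.5]]]
# R[s][a] = immediate reward
R = [[1.0, 0.0], [0.0, 2.0]]


def qmdp(iters=500):
    """QMDP: value iteration on the underlying MDP (full observability after one step)."""
    q = [[0.0 for _ in A] for _ in S]
    for _ in range(iters):
        q = [[R[s][a] + GAMMA * sum(T[a][s][s2] * max(q[s2]) for s2 in S)
              for a in A] for s in S]
    return q


def fib(iters=500):
    """Fast informed bound: the next action is chosen per observation, not per next state."""
    q = [[0.0 for _ in A] for _ in S]
    for _ in range(iters):
        q = [[R[s][a] + GAMMA * sum(
                  max(sum(T[a][s][s2] * Z[a][s2][o] * q[s2][a2] for s2 in S)
                      for a2 in A)
                  for o in O)
              for a in A] for s in S]
    return q


def upper_bound(q, belief):
    """Upper value bound at a belief: max over actions of the belief-weighted Q-values."""
    return max(sum(belief[s] * q[s][a] for s in S) for a in A)


if __name__ == "__main__":
    belief = [0.5, 0.5]
    print("QMDP bound:", upper_bound(qmdp(), belief))
    print("FIB bound: ", upper_bound(fib(), belief))
```

In this sketch the FIB value at any belief never exceeds the QMDP value, because taking the maximum once per observation is more restrictive than taking it once per successor state. Per the abstract, the paper's bounds tighten the FIB further at additional computational cost.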
Taxonomy
Topics: Mathematical Dynamics and Fractals
