Sound Heuristic Search Value Iteration for Undiscounted POMDPs with Reachability Objectives
Qi Heng Ho, Martin S. Feather, Federico Rossi, Zachary N. Sunberg, Morteza Lahijanian

TL;DR
This paper introduces a novel heuristic search value iteration algorithm for undiscounted POMDPs with reachability objectives, improving efficiency and policy quality by leveraging bounds and informed exploration.
Contribution
It extends point-based methods to indefinite-horizon reachability problems in POMDPs, providing a new algorithm with convergence guarantees and superior performance.
Findings
Yields stronger reachability probability guarantees than existing methods
Reduces computation time significantly relative to those methods
Provides policies with two-sided bounds on reachability probabilities
Abstract
Partially Observable Markov Decision Processes (POMDPs) are powerful models for sequential decision making under transition and observation uncertainties. This paper studies the challenging yet important problem in POMDPs known as the (indefinite-horizon) Maximal Reachability Probability Problem (MRPP), where the goal is to maximize the probability of reaching some target states. This is also a core problem in model checking with logical specifications and is naturally undiscounted (discount factor is one). Inspired by the success of point-based methods developed for discounted problems, we study their extensions to MRPP. Specifically, we focus on trial-based heuristic search value iteration techniques and present a novel algorithm that leverages the strengths of these techniques for efficient exploration of the belief space (informed search via value bounds) while addressing their…
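To make the idea of two-sided bounds on a reachability probability concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a hypothetical four-state POMDP in which states 0 and 1 cannot be told apart, state 2 is the target, and state 3 is a trap. The upper bound comes from the fully observable MDP (an agent that sees the state can do no worse), and the lower bound comes from "blind" policies that repeat a single action. All names (`T`, `TARGET`, `reach_values`, `bounds`) are illustrative assumptions.

```python
# Hypothetical example: states 0 and 1 are indistinguishable to the agent,
# state 2 is the target (absorbing), state 3 is a trap (absorbing).
TARGET = {2}
N_STATES = 4

# T[a][s] maps successor state -> probability under action a.
T = [
    [{2: 0.9, 3: 0.1}, {2: 0.1, 3: 0.9}, {2: 1.0}, {3: 1.0}],  # action 0
    [{2: 0.2, 3: 0.8}, {2: 0.8, 3: 0.2}, {2: 1.0}, {3: 1.0}],  # action 1
]


def reach_values(actions, sweeps=50):
    """Undiscounted reachability backup, started from 0 outside the target.

    Maximizes over the given actions in every state. Iterating from below
    only approaches the fixed point in general; this toy model is acyclic
    apart from the absorbing states, so it converges after one sweep.
    """
    v = [1.0 if s in TARGET else 0.0 for s in range(N_STATES)]
    for _ in range(sweeps):
        v = [
            1.0 if s in TARGET else max(
                sum(p * v[t] for t, p in T[a][s].items()) for a in actions
            )
            for s in range(N_STATES)
        ]
    return v


def bounds(belief):
    """Return (lower, upper) bounds on the max reachability probability."""
    # Upper bound: expected value of the fully observable MDP solution.
    v_mdp = reach_values(actions=[0, 1])
    upper = sum(b * v for b, v in zip(belief, v_mdp))
    # Lower bound: best policy that commits to one action forever.
    lower = max(
        sum(b * v for b, v in zip(belief, reach_values(actions=[a])))
        for a in range(len(T))
    )
    return lower, upper


lower, upper = bounds([0.5, 0.5, 0.0, 0.0])
print(f"{lower:.2f} <= P(reach target) <= {upper:.2f}")
```

With a uniform belief over states 0 and 1 the gap between the two bounds is wide, since the agent cannot observe which state it is in. In models with cycles, a value iterated from below is only a sound upper bound once it has converged, which is one reason undiscounted reachability needs more care than the discounted case.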
