Anytime Point-Based Approximations for Large POMDPs
J. Pineau, G. Gordon, S. Thrun

TL;DR
This paper introduces Point-Based Value Iteration (PBVI), an anytime algorithm for large POMDPs that pairs point-based value backups with novel belief point selection techniques, supported by theoretical justification and an empirical comparison to existing methods.
Contribution
The paper presents a new belief point selection method and combines it with point-based value backups to form PBVI, an anytime algorithm for solving large POMDPs efficiently.
Findings
PBVI outperforms some existing methods in large POMDPs.
The belief point selection technique improves solution quality.
Empirical results on robotic tasks demonstrate practical effectiveness.
Abstract
The Partially Observable Markov Decision Process has long been recognized as a rich framework for real-world planning and control problems, especially in robotics. However, exact solutions in this framework are typically computationally intractable for all but the smallest problems. A well-known technique for speeding up POMDP solving involves performing value backups at specific belief points, rather than over the entire belief simplex. The efficiency of this approach, however, depends greatly on the selection of points. This paper presents a set of novel techniques for selecting informative belief points which work well in practice. The point selection procedure is combined with point-based value backups to form an effective anytime POMDP algorithm called Point-Based Value Iteration (PBVI). The first aim of this paper is to introduce this algorithm and present a theoretical analysis…
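To make the idea of a value backup at a single belief point concrete, here is a minimal sketch of the standard point-based backup for a small tabular POMDP. It is an illustration under stated assumptions, not the authors' implementation: the array layouts for `T`, `O`, and `R`, the function name `point_based_backup`, and the discount `gamma` are all hypothetical choices made for this example.

```python
import numpy as np

def point_based_backup(b, Gamma, T, O, R, gamma):
    """One value backup at belief point b (illustrative sketch).

    Assumed (hypothetical) array layouts:
      b:     (S,)      belief over states
      Gamma: (K, S)    current set of alpha vectors
      T:     (A, S, S) T[a, s, s2] = P(s2 | s, a)
      O:     (A, S, Z) O[a, s2, z] = P(z | s2, a)
      R:     (A, S)    R[a, s] = immediate reward
    Returns the single alpha vector that is best at b after one backup.
    """
    num_actions = T.shape[0]
    num_obs = O.shape[2]
    best_value, best_alpha = -np.inf, None
    for a in range(num_actions):
        alpha_a = R[a].astype(float)  # astype returns a copy, R is not modified
        for z in range(num_obs):
            # proj[k, s] = gamma * sum_s2 T[a, s, s2] * O[a, s2, z] * Gamma[k, s2]
            proj = gamma * (Gamma * O[a, :, z]) @ T[a].T
            # Keep only the projected vector that is best at this belief.
            alpha_a += proj[np.argmax(proj @ b)]
        value = alpha_a @ b
        if value > best_value:
            best_value, best_alpha = value, alpha_a
    return best_alpha
```

Because the maximization is carried out only at `b`, each backup yields one alpha vector per belief point, so the size of the value function representation stays bounded by the number of points. This is also why the choice of belief points matters as much as the abstract says.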
