Value-Function Approximations for Partially Observable Markov Decision Processes
M. Hauskrecht

TL;DR
This paper surveys and develops approximation methods for solving POMDPs that trade accuracy for computational speed, and evaluates them experimentally on an agent navigation problem.
Contribution
The paper provides a comprehensive survey of existing approximation techniques for POMDPs and introduces new methods along with refinements of existing ones.
Findings
Proposes new approximation methods and refinements of existing techniques.
Analyzes the properties of existing methods and the relations among them.
Validates the methods experimentally on an agent navigation problem.
Abstract
Partially observable Markov decision processes (POMDPs) provide an elegant mathematical framework for modeling complex decision and planning problems in stochastic domains in which states of the system are observable only indirectly, via a set of imperfect or noisy observations. The modeling advantage of POMDPs, however, comes at a price -- exact methods for solving them are computationally very expensive and thus applicable in practice only to very simple problems. We focus on efficient approximation (heuristic) methods that attempt to alleviate the computational problem and trade off accuracy for speed. We have two objectives here. First, we survey various approximation methods, analyze their properties and relations and provide some new insights into their differences. Second, we present a number of new approximation methods and novel refinements of existing techniques. The…
