Feature-Based Belief Aggregation for Partially Observable Markov Decision Problems
Yuchao Li, Kim Hammar, and Dimitri Bertsekas

TL;DR
This paper introduces a feature-based aggregation method for approximating cost functions in POMDPs, enabling more efficient solutions with theoretical error bounds and improved accuracy through biased aggregation.
Contribution
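The paper proposes a two-stage aggregation approach for POMDPs: unobservable states are first aggregated into feature states, and representative beliefs are then introduced over those feature states. This makes dynamic programming methods easier to apply to the resulting aggregate problem and comes with error bounds.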
Findings
Derives a new bound on the approximation error of the scheme.
Establishes conditions for lower bounds on the optimal cost.
Introduces a biased aggregation method to improve approximation quality.
Abstract
We consider a finite-state partially observable Markov decision problem (POMDP) with an infinite horizon and a discounted cost, and we propose a new method for computing a cost function approximation that is based on features and aggregation. In particular, using the classical belief-space formulation, we construct a related Markov decision problem (MDP) by first aggregating the unobservable states into feature states, and then introducing representative beliefs over these feature states. This two-stage aggregation approach facilitates the use of dynamic programming methods for solving the aggregate problem and provides additional design flexibility. The optimal cost function of the aggregate problem can in turn be used within an on-line approximation in value space scheme for the original POMDP. We derive a new bound on the approximation error of our scheme. In addition, we establish…
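The two-stage construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a hard aggregation (each state maps to exactly one feature state) and a nearest-neighbor rule in L1 distance for picking a representative belief; the abstract specifies neither choice. The names feature_belief, nearest_representative, feature_of_state, and the 4-state example are all hypothetical.

```python
import numpy as np

def feature_belief(belief, feature_of_state, num_features):
    """Stage 1 (assumed hard aggregation): sum the belief mass of all
    states that map to the same feature state."""
    q = np.zeros(num_features)
    np.add.at(q, feature_of_state, belief)  # unbuffered scatter-add
    return q

def nearest_representative(q, representatives):
    """Stage 2 (assumed nearest-neighbor rule): index of the representative
    feature belief closest to q in L1 distance."""
    distances = np.abs(representatives - q).sum(axis=1)
    return int(np.argmin(distances))

# Hypothetical problem: 4 unobservable states, 2 feature states.
belief = np.array([0.1, 0.2, 0.3, 0.4])      # belief over the states
feature_of_state = np.array([0, 0, 1, 1])    # state -> feature state
representatives = np.array([                 # beliefs over feature states
    [1.0, 0.0],
    [0.5, 0.5],
    [0.0, 1.0],
])

q = feature_belief(belief, feature_of_state, num_features=2)
idx = nearest_representative(q, representatives)
print(q, idx)
```

In this sketch the set of representative beliefs plays the role of the state space of the aggregate MDP. A cost function computed on that set could then be used as the terminal cost approximation in a lookahead scheme for the original POMDP, which is the role the abstract assigns to the optimal cost function of the aggregate problem.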
