Near Optimal Approximations and Finite Memory Policies for POMDPs with Continuous Spaces
Ali Devran Kara, Erhan Bayraktar, Serdar Yuksel

TL;DR
This paper introduces a rigorous approximation method for continuous-space POMDPs by discretizing observations and constructing finite MDPs, enabling near-optimal policies and a convergent Q-learning algorithm with finite memory.
Contribution
It generalizes recent work to provide a practical discretization approach for POMDPs with continuous spaces, ensuring near-optimality and convergence of learning algorithms.
Findings
Discretizing observations yields near-optimal policies under regularity assumptions.
The proposed Q-learning algorithm uses finite memory and converges to a solution of the optimality equation of the approximating finite MDP.
Quantizing the measurements makes it possible to use refined filter stability conditions.
Abstract
We study an approximation method for partially observed Markov decision processes (POMDPs) with continuous spaces. Belief MDP reduction, which has been the standard approach to study POMDPs, requires rigorous approximation methods for practical applications, because the state space is lifted to the space of probability measures. Generalizing recent work, in this paper we present rigorous approximation methods via discretizing the observation space and constructing a fully observed finite MDP model using a finite-length history of the discrete observations and control actions. We show that the resulting policy is near-optimal under some regularity assumptions on the channel, and under certain controlled filter stability requirements for the hidden state process. Furthermore, by quantizing the measurements, we are able to utilize refined filter stability conditions. We also provide a Q…
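The construction described in the abstract (quantize the observations, treat a finite window of recent quantized observations and actions as the state of a finite MDP, then run Q-learning on that state) can be illustrated with a minimal sketch. This is not the authors' algorithm or code. The toy scalar model in `step`, the uniform quantizer, the discount factor `beta`, the action set, the uniform exploration policy, and the `1/visits` step size are all assumptions chosen only to make the example self-contained.

```python
import random
from collections import defaultdict, deque


def quantize(y, lo=-2.0, hi=2.0, bins=4):
    """Map a continuous observation to one of `bins` uniform cells on [lo, hi]."""
    y = min(max(y, lo), hi)  # clip so out-of-range values land in the end cells
    return min(int((y - lo) / (hi - lo) * bins), bins - 1)


def step(x, u, rng):
    """Hypothetical toy model: scalar linear state, noisy observation, quadratic cost."""
    x_next = 0.8 * x + u + rng.gauss(0.0, 0.1)
    y_next = x_next + rng.gauss(0.0, 0.1)
    cost = x * x + 0.1 * u * u
    return x_next, y_next, cost


def finite_window_q_learning(steps=20000, window=2, beta=0.9,
                             actions=(-0.5, 0.0, 0.5), seed=0):
    """Tabular Q-learning (cost minimization) on finite-window states."""
    rng = random.Random(seed)
    q = defaultdict(float)     # q[(window_state, action_index)]
    visits = defaultdict(int)  # visit counts for the decaying step size

    x = rng.gauss(0.0, 1.0)
    # The window stores the last `window` (quantized observation, action index) pairs.
    hist = deque([(quantize(x), 1)] * window, maxlen=window)

    for _ in range(steps):
        s = tuple(hist)
        a = rng.randrange(len(actions))  # uniform random exploration (assumption)
        x, y, cost = step(x, actions[a], rng)
        hist.append((quantize(y), a))    # oldest pair drops out automatically
        s_next = tuple(hist)

        visits[(s, a)] += 1
        alpha = 1.0 / visits[(s, a)]
        target = cost + beta * min(q[(s_next, b)] for b in range(len(actions)))
        q[(s, a)] += alpha * (target - q[(s, a)])
    return q


def greedy_action(q, s, n_actions=3):
    """Index of the cost-minimizing action at finite-window state `s`."""
    return min(range(n_actions), key=lambda b: q[(s, b)])
```

A learned table can be queried with `greedy_action(q, s)`, where `s` is a tuple of `window` pairs `(quantized_observation, action_index)`. Increasing `bins` or `window` enlarges the finite state space, which is the trade-off the abstract's near-optimality results concern.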
Taxonomy
Topics: Advanced Optimization Algorithms Research · Optimization and Packing Problems
