Near Optimal Approximations and Finite Memory Policies for POMDPs with Continuous Spaces

Ali Devran Kara; Erhan Bayraktar; Serdar Yuksel

arXiv:2410.02895·math.OC·January 20, 2025

TL;DR

This paper introduces a rigorous approximation method for continuous-space POMDPs by discretizing observations and constructing finite MDPs, enabling near-optimal policies and a convergent Q-learning algorithm with finite memory.

Contribution

The paper generalizes recent work to give a practical discretization approach for POMDPs with continuous spaces, with near-optimality guarantees and a convergent learning algorithm.

Findings

1. Discretizing observations yields near-optimal policies under regularity assumptions on the channel and controlled filter stability requirements.

2. The proposed finite-memory Q-learning algorithm converges to a solution of the optimality equation of the approximate finite MDP.

3. Quantizing the measurements makes it possible to use refined filter stability conditions.
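
As a rough illustration of the discretization idea in the findings above, the sketch below quantizes a scalar observation into a finite number of bins and keeps a finite window of recent (previous action, quantized observation) pairs as the state of a finite model. The uniform quantizer, the bin range, and the names `quantize` and `FiniteWindow` are assumptions made for this example; they are not taken from the paper.

```python
from collections import deque


def quantize(y, low, high, num_bins):
    """Map a scalar observation to a bin index in {0, ..., num_bins - 1}.

    Assumption: a uniform quantizer on [low, high]; values outside the
    range are clipped to the nearest edge bin.
    """
    y = min(max(y, low), high)
    width = (high - low) / num_bins
    return min(int((y - low) / width), num_bins - 1)


class FiniteWindow:
    """Hypothetical finite-memory state: the last `length` pairs of
    (previous action, quantized observation)."""

    def __init__(self, length):
        # deque with maxlen drops the oldest pair automatically
        self.items = deque(maxlen=length)

    def push(self, prev_action, obs_bin):
        self.items.append((prev_action, obs_bin))

    def key(self):
        # Hashable tuple, usable as a state of a finite MDP
        return tuple(self.items)
```

With `num_bins` observation bins, `num_actions` actions, and window length `N`, the number of window states is finite (at most `(num_actions * num_bins) ** N` once the window is full), which is what makes a tabular treatment possible.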

Abstract

We study an approximation method for partially observed Markov decision processes (POMDPs) with continuous spaces. Belief MDP reduction, which has been the standard approach to study POMDPs, requires rigorous approximation methods for practical applications, due to the state space being lifted to the space of probability measures. Generalizing recent work, in this paper we present rigorous approximation methods via discretizing the observation space and constructing a fully observed finite MDP model using a finite length history of the discrete observations and control actions. We show that the resulting policy is near-optimal under some regularity assumptions on the channel, and under certain controlled filter stability requirements for the hidden state process. Furthermore, by quantizing the measurements, we are able to utilize refined filter stability conditions. We also provide a Q…
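
A minimal sketch of tabular Q-learning over a finite window of quantized observations and actions, in the spirit of the finite-memory construction described in the abstract. Everything concrete here is an assumption for illustration: the toy linear dynamics in `step`, the quadratic cost, the uniform quantizer, the uniformly random exploration policy, and the `1 / visits` step size. It is not the authors' algorithm or experimental setup.

```python
import random
from collections import defaultdict, deque


def quantize(y, low=-2.0, high=2.0, num_bins=4):
    """Uniform quantizer with clipping (an assumed choice)."""
    y = min(max(y, low), high)
    width = (high - low) / num_bins
    return min(int((y - low) / width), num_bins - 1)


def step(x, u, rng):
    """Hypothetical toy POMDP: hidden state x, noisy observation y, stage cost."""
    cost = x * x + 0.1 * u * u
    x_next = 0.8 * x + u + rng.gauss(0.0, 0.3)
    y_next = x_next + rng.gauss(0.0, 0.2)
    return x_next, y_next, cost


def finite_window_q_learning(window=2, steps=5000, beta=0.9, seed=0):
    rng = random.Random(seed)
    actions = (-0.5, 0.0, 0.5)
    q = defaultdict(float)     # Q[(window_state, action_index)], starts at 0
    visits = defaultdict(int)  # visit counts for the decaying step size
    x = rng.gauss(0.0, 1.0)
    hist = deque(maxlen=window)
    hist.append((None, quantize(x + rng.gauss(0.0, 0.2))))
    for _ in range(steps):
        s = tuple(hist)
        a = rng.randrange(len(actions))  # uniform exploration
        x, y, cost = step(x, actions[a], rng)
        hist.append((a, quantize(y)))    # only quantized data is stored
        s_next = tuple(hist)
        visits[(s, a)] += 1
        alpha = 1.0 / visits[(s, a)]
        # Cost-minimizing Q-learning update on window states
        target = cost + beta * min(q[(s_next, b)] for b in range(len(actions)))
        q[(s, a)] += alpha * (target - q[(s, a)])
    return dict(q)
```

The learner never sees the hidden state `x` or the raw observation `y`; it only indexes its table by the window of quantized observations and past actions, so the table stays finite regardless of the continuous underlying spaces.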

Taxonomy

Topics: Advanced Optimization Algorithms Research · Optimization and Packing Problems