Refined Bounds on Near Optimality Finite Window Policies in POMDPs and Their Reinforcement Learning

Yunus Emre Demirci; Ali Devran Kara; Serdar Yüksel

arXiv:2409.04351·math.OC·September 9, 2024

TL;DR

This paper refines theoretical bounds on the near optimality of finite window policies in POMDPs, extending previous results to Wasserstein distance and providing error bounds under more relaxed conditions for reinforcement learning applications.

Contribution

It extends existing performance bounds for finite window POMDP policies to Wasserstein distance and gives error bounds under more relaxed conditions for reinforcement learning in partially observable environments.

Findings

01. Refined bounds using Wasserstein distance for POMDP policies.

02. Established uniform filter stability in expected Wasserstein distance.

03. Provided explicit examples demonstrating improved bounds.

Abstract

Finding optimal policies for Partially Observable Markov Decision Processes (POMDPs) is challenging due to their uncountable state spaces when transformed into fully observable Markov Decision Processes (MDPs) using belief states. Traditional methods such as dynamic programming or policy iteration are difficult to apply in this context, necessitating the use of approximation methods on belief states or other techniques. Recently, in (Journal of Machine Learning Research, vol. 23, pp. 1-46, 2022) and (Mathematics of Operations Research, vol. 48, pp. 2066-2093, Nov. 2023), it was shown that sliding finite window based policies are near-optimal for POMDPs with standard Borel valued hidden state spaces, and can be learned via reinforcement learning, with error bounds explicitly dependent on a uniform filter stability term involving total variation in expectation and sample path-wise,…
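To make the idea of a sliding finite window policy concrete, here is a minimal Python sketch. It is not the authors' algorithm or code: the two-state toy POMDP, the function names (`make_window`, `window_key`, `step`, `q_learning`), and all parameter values are hypothetical, and plain tabular Q-learning with cost minimization is assumed. The learner never sees the hidden state; it treats the last `n` (observation, action) pairs together with the current observation as its state.

```python
import random
from collections import defaultdict, deque


def make_window(n):
    """Empty sliding window holding the last n (observation, action) pairs."""
    return deque(maxlen=n)


def window_key(window, obs):
    """Hashable finite-window state: past (obs, action) pairs plus current obs."""
    return (tuple(window), obs)


def step(state, action, rng):
    """Hypothetical toy POMDP with hidden state in {0, 1} and a noisy sensor."""
    # Action 1 flips the hidden state with prob. 0.9; action 0 with prob. 0.1.
    flip = rng.random() < (0.9 if action == 1 else 0.1)
    next_state = 1 - state if flip else state
    cost = 1.0 if next_state == 1 else 0.0  # being in state 1 is costly
    # The observation matches the hidden state with probability 0.8.
    obs = next_state if rng.random() < 0.8 else 1 - next_state
    return next_state, obs, cost


def q_learning(n=2, steps=20000, beta=0.9, eps=0.2, seed=0):
    """Tabular Q-learning over finite-window states (discounted cost, factor beta)."""
    rng = random.Random(seed)
    q = defaultdict(lambda: [0.0, 0.0])  # q[window_key][action]
    visits = defaultdict(int)
    state, obs = 0, 0
    window = make_window(n)
    for _ in range(steps):
        key = window_key(window, obs)
        # Epsilon-greedy exploration; greedy means lowest estimated cost.
        if rng.random() < eps:
            action = rng.randrange(2)
        else:
            action = min((0, 1), key=lambda a: q[key][a])
        state, next_obs, cost = step(state, action, rng)
        window.append((obs, action))  # oldest pair drops out automatically
        next_key = window_key(window, next_obs)
        visits[(key, action)] += 1
        alpha = 1.0 / visits[(key, action)]  # decaying step size
        target = cost + beta * min(q[next_key])
        q[key][action] += alpha * (target - q[key][action])
        obs = next_obs
    return q


if __name__ == "__main__":
    table = q_learning()
    print(len(table), "finite-window states visited")
```

The window length `n` is the tuning knob in this sketch: a longer window keeps more history per state at the cost of a larger Q-table.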
