Agent-state based policies in POMDPs: Beyond belief-state MDPs

Amit Sinha; Aditya Mahajan

arXiv:2409.15703·eess.SY·October 1, 2024

Open Access

TL;DR

This paper unifies and analyzes various agent-state based policy approaches in POMDPs, extending beyond belief-state MDPs, and demonstrates how these ideas enhance learning algorithms like Q-learning and actor-critic methods.

Contribution

The paper provides a unified framework for agent-state based policies in POMDPs and explores how that framework can improve reinforcement learning algorithms.

Findings

01. Unified treatment of agent-state based policies

02. Development of approaches for optimal and approximate policies

03. Enhancement of Q-learning and actor-critic algorithms

Abstract

The traditional approach to POMDPs is to convert them into fully observed MDPs by considering a belief state as an information state. However, a belief-state based approach requires perfect knowledge of the system dynamics and is therefore not applicable in the learning setting where the system model is unknown. Various approaches to circumvent this limitation have been proposed in the literature. We present a unified treatment of some of these approaches by viewing them as models where the agent maintains a local recursively updateable agent state and chooses actions based on the agent state. We highlight the different classes of agent-state based policies and the various approaches that have been proposed in the literature to find good policies within each class. These include the designer's approach to find optimal non-stationary agent-state based policies, policy search approaches…
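To make the idea of a "recursively updateable agent state" concrete, here is a minimal Python sketch of tabular Q-learning run on an agent state instead of a belief state. It is an illustration under stated assumptions, not the authors' algorithm: the agent state is assumed to be a sliding window of the last two (action, observation) pairs, and the environment `ToyPOMDP`, the function names, and all hyperparameters are hypothetical choices made for this example.

```python
import random
from collections import defaultdict


class ToyPOMDP:
    """Hypothetical two-state POMDP used only for illustration."""

    def reset(self, rng):
        self.state = rng.randrange(2)
        return self._observe(rng)

    def _observe(self, rng):
        # The observation matches the hidden state with probability 0.8.
        return self.state if rng.random() < 0.8 else 1 - self.state

    def step(self, action, rng):
        # Reward 1 for guessing the hidden state, 0 otherwise.
        reward = 1.0 if action == self.state else 0.0
        if rng.random() < 0.1:  # the hidden state flips occasionally
            self.state = 1 - self.state
        return self._observe(rng), reward


def update_agent_state(z, action, obs, window=2):
    """Recursive update z' = phi(z, a, y): keep the last `window` pairs."""
    return (z + ((action, obs),))[-window:]


def agent_state_q_learning(env, n_actions, episodes=200, horizon=20,
                           alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    """Tabular Q-learning indexed by the agent state, not the belief."""
    rng = random.Random(seed)
    q = defaultdict(lambda: [0.0] * n_actions)
    for _ in range(episodes):
        obs = env.reset(rng)
        z = ((None, obs),)  # initial agent state: first observation only
        for _ in range(horizon):
            if rng.random() < eps:
                a = rng.randrange(n_actions)  # explore
            else:
                a = max(range(n_actions), key=lambda i: q[z][i])  # exploit
            obs, reward = env.step(a, rng)
            z_next = update_agent_state(z, a, obs)
            td_target = reward + gamma * max(q[z_next])
            q[z][a] += alpha * (td_target - q[z][a])
            z = z_next
    return q


q_table = agent_state_q_learning(ToyPOMDP(), n_actions=2)
```

Neither the update nor the action choice uses the transition or observation probabilities, which is what makes this kind of scheme usable when the model is unknown. The agent state is generally not an information state, so the standard convergence and optimality guarantees of Q-learning for MDPs should not be assumed to carry over.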

Taxonomy

Topics: Auction Theory and Applications

Methods: Q-Learning