What should be observed for optimal reward in POMDPs?

Alyzia-Maria Konsta; Alberto Lluch Lafuente; Christoph Matheja

arXiv:2405.10768·cs.AI·July 12, 2024

TL;DR

This paper investigates how to select an agent's sensors in a POMDP within a fixed budget so that its expected reward stays below a given threshold. It introduces the optimal observability problem (OOP) and provides algorithms for a decidable fragment.

Contribution

It formulates the optimal observability problem (OOP) for POMDPs, proves that it is undecidable in general, and presents two algorithms for a decidable fragment, one based on MDP strategies and one based on SMT.

Findings

1. OOP is undecidable in general.
2. OOP is decidable when restricted to positional strategies.
3. The algorithms show promising results on typical POMDP examples.

Abstract

Partially observable Markov Decision Processes (POMDPs) are a standard model for agents making decisions in uncertain environments. Most work on POMDPs focuses on synthesizing strategies based on the available capabilities. However, system designers can often control an agent's observation capabilities, e.g. by placing or selecting sensors. This raises the question of how one should select an agent's sensors cost-effectively such that it achieves the desired goals. In this paper, we study the novel optimal observability problem OOP: Given a POMDP M, how should one change M's observation capabilities within a fixed budget such that its (minimal) expected reward remains below a given threshold? We show that the problem is undecidable in general and decidable when considering positional strategies only. We present two algorithms for a decidable fragment of the OOP: one based on optimal…
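
To make the problem statement concrete, here is a minimal brute-force sketch of an OOP-style query on a toy model. Everything in it is an assumption made for illustration and is not taken from the paper or its repository: the five-cell corridor, the goal in the middle, the uniform start distribution, the convention that a sensor placed on a cell reveals that cell while all other cells look identical, and the names `step`, `observe`, `expected_steps`, and `solve_oop`. It restricts attention to deterministic positional strategies (one action per observation), which is the setting the abstract describes as decidable, and it is not either of the paper's two algorithms.

```python
from itertools import combinations, product

# Hypothetical toy model: a corridor of 5 cells with the goal in the middle.
N_STATES = 5
GOAL = 2
ACTIONS = (-1, +1)  # move left / move right


def step(state, action):
    """Deterministic move; walls at both ends keep the agent in place."""
    return min(max(state + action, 0), N_STATES - 1)


def observe(state, sensors):
    """A cell with a sensor reveals itself; every other cell looks the same."""
    return state if state in sensors else None


def expected_steps(sensors, strategy):
    """Expected number of steps to reach GOAL from a uniform non-goal start.

    Each step costs 1. The strategy is positional: it maps the current
    observation to an action. Dynamics and strategy are deterministic, so
    revisiting a state means the goal is never reached (infinite cost).
    """
    starts = [s for s in range(N_STATES) if s != GOAL]
    total = 0
    for start in starts:
        state, steps, seen = start, 0, set()
        while state != GOAL:
            if state in seen:
                return float("inf")
            seen.add(state)
            state = step(state, strategy[observe(state, sensors)])
            steps += 1
        total += steps
    return total / len(starts)


def solve_oop(budget, threshold):
    """Search for at most `budget` sensors and a positional strategy whose
    expected cost is at most `threshold`. Returns None if none exists."""
    non_goal = [s for s in range(N_STATES) if s != GOAL]
    for k in range(budget + 1):
        for sensors in combinations(non_goal, k):
            observations = [None, *sensors]
            # Enumerate every assignment of one action per observation.
            for choice in product(ACTIONS, repeat=len(observations)):
                strategy = dict(zip(observations, choice))
                value = expected_steps(set(sensors), strategy)
                if value <= threshold:
                    return set(sensors), strategy, value
    return None


if __name__ == "__main__":
    print(solve_oop(budget=1, threshold=2.0))
    print(solve_oop(budget=2, threshold=2.0))
```

In this toy corridor, a budget of one sensor is not enough: whichever single cell is revealed, some start cell on the unobserved side gets stuck at a wall or in a loop, so `solve_oop(1, 2.0)` returns `None`. With a budget of two, revealing both cells on one side of the goal lets the agent tell the two sides apart, and the search finds a strategy with an expected cost of 1.5 steps. Exhaustive enumeration is exponential in the number of cells and is only meant to illustrate the question being asked.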

Code & Models

Repositories

alyziakonsta/optimal-observability-problem
Official

Taxonomy

Topics: Reinforcement Learning in Robotics · Formal Methods in Verification · Bayesian Modeling and Causal Inference