Shielding in Resource-Constrained Goal POMDPs

Michal Ajdarów; Šimon Brlej; Petr Novotný

arXiv:2211.15349·cs.AI·November 29, 2022


TL;DR

This paper combines formally computed shields with the POMCP heuristic search algorithm to solve resource-constrained goal optimization (RSGO) in POMDPs: minimizing the expected cost of reaching a goal while preventing resource exhaustion.

Contribution

It presents a novel two-step approach: designing shields via formal methods and integrating them with POMCP for resource-aware POMDP planning.

Findings

1. The combined algorithm successfully prevents resource exhaustion in benchmark scenarios.

2. The approach improves planning efficiency in resource-constrained POMDPs.

3. Experimental results demonstrate the method's applicability and effectiveness.

Abstract

We consider partially observable Markov decision processes (POMDPs) modeling an agent that needs a supply of a certain resource (e.g., electricity stored in batteries) to operate correctly. The resource is consumed by the agent's actions and can be replenished only in certain states. The agent aims to minimize the expected cost of reaching some goal while preventing resource exhaustion, a problem we call resource-constrained goal optimization (RSGO). We take a two-step approach to the RSGO problem. First, using formal methods techniques, we design an algorithm computing a shield for a given scenario: a procedure that observes the agent and prevents it from using actions that might eventually lead to resource exhaustion. Second, we augment the POMCP heuristic search algorithm for POMDP planning with our shields to obtain an algorithm solving the RSGO problem. We implement our…
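To make the shield idea concrete, here is a minimal Python sketch on a toy model. It is not the paper's algorithm: it assumes full observability, whereas the paper's shields work under partial observability. The model, the capacity, and every name (`MODEL`, `RELOAD`, `compute_safe_set`, `allowed_actions`, `shielded_rollout`) are hypothetical. The sketch computes the set of (state, resource level) pairs from which exhaustion can be avoided forever, then masks every action that could leave that set, including inside a random rollout of the kind a POMCP-style search would run.

```python
import random
from itertools import product

# Hypothetical toy model (not from the paper), fully observable for simplicity.
CAPACITY = 4
RELOAD = {"base"}  # entering these states refills the resource to CAPACITY
# state -> action -> (resource consumption, possible successor states)
MODEL = {
    "base": {"out": (1, ["A"]), "wait": (1, ["base"])},
    "A": {"home": (1, ["base"]), "far": (1, ["B"]), "risky": (1, ["base", "pit"])},
    "B": {"back": (1, ["A"])},
    "pit": {"loop": (1, ["pit"])},  # no reload reachable: exhaustion is inevitable
}

def next_level(level, cost, succ):
    """Resource level after paying `cost` and arriving in `succ`."""
    return CAPACITY if succ in RELOAD else level - cost

def keeps_safe(state, level, action, safe):
    """True if the action is affordable and every possible outcome stays in `safe`."""
    cost, succs = MODEL[state][action]
    if cost > level:
        return False
    return all((s2, next_level(level, cost, s2)) in safe for s2 in succs)

def compute_safe_set():
    """Greatest fixed point: drop pairs with no safe action until nothing changes."""
    safe = set(product(MODEL, range(CAPACITY + 1)))
    changed = True
    while changed:
        changed = False
        for state, level in sorted(safe):
            if not any(keeps_safe(state, level, a, safe) for a in MODEL[state]):
                safe.discard((state, level))
                changed = True
    return safe

SAFE = compute_safe_set()

def allowed_actions(state, level, safe=SAFE):
    """The shield: actions that cannot lead to eventual resource exhaustion."""
    return sorted(a for a in MODEL[state] if keeps_safe(state, level, a, safe))

def shielded_rollout(state, level, steps, rng):
    """Random rollout restricted to shielded actions; outcomes are sampled uniformly."""
    trace = [(state, level)]
    for _ in range(steps):
        action = rng.choice(allowed_actions(state, level))
        cost, succs = MODEL[state][action]
        succ = rng.choice(succs)
        state, level = succ, next_level(level, cost, succ)
        trace.append((state, level))
    return trace
```

In this toy model, `allowed_actions("A", 3)` permits both `far` and `home`, while `allowed_actions("A", 2)` permits only `home`, because the detour through `B` would leave too little resource to return to `base`. The `risky` action is never permitted since one of its outcomes is the `pit` state.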


Code & Models

Repositories

xbrlej/fipomdp (official)

Videos

Shielding in Resource-Constrained Goal POMDPs · Underline

Taxonomy

Topics: Reinforcement Learning in Robotics · Bayesian Modeling and Causal Inference · Formal Methods in Verification