Parameter-Independent Strategies for pMDPs via POMDPs

Sebastian Arming; Ezio Bartocci; Krishnendu Chatterjee; Joost-Pieter Katoen; Ana Sokolova

arXiv:1806.05126 · cs.LO · June 14, 2018


TL;DR

This paper computes parameter-independent, expectation-optimal strategies for parametric MDPs (pMDPs) whose parameter values are unknown but follow a known probability distribution, by encoding the problem as a partially observable MDP (POMDP).

Contribution

The paper is the first to study computing expectation-optimal, parameter-independent strategies for pMDPs with unknown parameter values, and it does so by encoding the problem as a POMDP.
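
To give a feel for why partial observability enters the picture: the parameter value is never observed, but each observed transition carries evidence about it. The sketch below illustrates that general idea only; it is not the paper's encoding, whose details this summary does not give. It assumes a hypothetical finite set of candidate parameter values with a prior, and the names `update_belief` and `likelihood` are made up for this example.

```python
def update_belief(belief, likelihood):
    """Bayes update of a belief over candidate parameter values.

    belief:     dict mapping parameter value -> probability (assumed finite support)
    likelihood: function giving the probability of the observed transition
                when the parameter takes a given value
    """
    unnormalized = {p: w * likelihood(p) for p, w in belief.items()}
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observed transition has probability 0 under every candidate")
    return {p: w / total for p, w in unnormalized.items()}


# Hypothetical prior: the parameter is 0.2 or 0.9 with equal probability.
prior = {0.2: 0.5, 0.9: 0.5}

# Suppose a transition that fires with probability p was observed to fire.
posterior = update_belief(prior, lambda p: p)
```

After the update, the belief shifts towards the larger candidate value, since that value makes the observed transition more likely. A strategy that reacts to such beliefs still never reads the parameter itself.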

Findings

01. Effective in various benchmarks including robot navigation and consensus protocols.

02. Outperforms existing methods in handling parameter uncertainties.

03. Demonstrates practical applicability through experimental evaluation.

Abstract

Markov Decision Processes (MDPs) are a popular class of models suitable for solving control decision problems in probabilistic reactive systems. We consider parametric MDPs (pMDPs) that include parameters in some of the transition probabilities to account for stochastic uncertainties of the environment such as noise or input disturbances. We study pMDPs with reachability objectives where the parameter values are unknown and impossible to measure directly during execution, but there is a probability distribution known over the parameter values. We study for the first time computing parameter-independent strategies that are expectation optimal, i.e., optimize the expected reachability probability under the probability distribution over the parameters. We present an encoding of our problem to partially observable MDPs (POMDPs), i.e., a reduction of our problem to computing optimal…
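
The notion of an expectation-optimal, parameter-independent strategy can be made concrete on a toy example. The following is a minimal sketch under stated assumptions, not the paper's algorithm: the pMDP `toy_pmdp`, its states and actions, and the two-point prior are all hypothetical, and the search is restricted to memoryless deterministic strategies, which is a simplification that may not suffice in general.

```python
from itertools import product

GOAL, FAIL = "goal", "fail"


def toy_pmdp(p):
    """Instantiate a hypothetical one-parameter pMDP at parameter value p."""
    return {
        "s0": {
            # Success probability depends on the unknown parameter p.
            "risky": {GOAL: p, FAIL: 1.0 - p},
            # Parameter-free action that may loop back to s0.
            "safe": {GOAL: 0.4, "s0": 0.2, FAIL: 0.4},
        },
    }


def reach_probability(mdp, strategy, start="s0", sweeps=200):
    """Probability of reaching GOAL under a memoryless strategy, by fixed-point iteration."""
    value = {s: 0.0 for s in mdp}
    value[GOAL], value[FAIL] = 1.0, 0.0
    for _ in range(sweeps):
        for s in mdp:
            value[s] = sum(pr * value[t] for t, pr in mdp[s][strategy[s]].items())
    return value[start]


def expected_reach(strategy, prior):
    """Expected reachability probability of one fixed strategy under the prior."""
    return sum(w * reach_probability(toy_pmdp(p), strategy) for p, w in prior.items())


def best_parameter_independent(prior):
    """Enumerate memoryless deterministic strategies and keep the best in expectation."""
    template = toy_pmdp(0.5)  # parameter value is irrelevant for listing actions
    states = list(template)
    choices = product(*(list(template[s]) for s in states))
    candidates = [dict(zip(states, c)) for c in choices]
    return max(candidates, key=lambda strat: expected_reach(strat, prior))


prior = {0.2: 0.5, 0.9: 0.5}  # hypothetical distribution over the parameter
best = best_parameter_independent(prior)
```

With this prior, `risky` has the higher expected reachability probability even though `safe` would be the better choice if the parameter were known to be 0.2. The strategy is fixed before the parameter is revealed, which is what makes it parameter-independent.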
