Reachability and safety objectives in Markov decision processes on long but finite horizons

Galit Ashkenazi-Golan; János Flesch; Arkadi Predtetchinski; Eilon Solan

arXiv:1911.05578·math.OC·November 14, 2019·J. Optim. Theory Appl.

TL;DR

This paper studies reachability and safety objectives in Markov decision processes on long but finite horizons, gives conditions under which overtaking optimal strategies exist, and extends the results to two-player zero-sum games.

Contribution

It gives sufficient conditions for the existence of overtaking optimal strategies in Markov decision processes on long but finite horizons, and extends the analysis to safety objectives and two-player zero-sum games.

Findings

01

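There exists a pure stationary strategy that is not overtaken by any pure strategy (under a condition on the transition structure) nor by any stationary strategy (under genericity).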

02

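An overtaking optimal strategy, that is, a strategy not overtaken by any other strategy, does not always exist; the paper provides sufficient conditions for its existence.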

03

Results extend to safety objectives and two-player zero-sum games.

Abstract

We consider discrete-time Markov decision processes in which the decision maker is interested in long but finite horizons. First we consider the reachability objective: the decision maker's goal is to reach a specific target state with the highest possible probability. Formally, a strategy $\sigma$ overtakes another strategy $\sigma'$ if the probability of reaching the target state within horizon $t$ is larger under $\sigma$ than under $\sigma'$ for all sufficiently large $t \in \mathbb{N}$. We prove that there exists a pure stationary strategy that is not overtaken by any pure strategy nor by any stationary strategy, under some condition on the transition structure and under genericity, respectively. A strategy that is not overtaken by any other strategy, called an overtaking optimal strategy, does not always exist. We provide sufficient conditions for its existence. Next we consider safety…
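
To make the overtaking definition concrete, here is a minimal sketch in Python. It is an illustration only and is not taken from the paper: the three-state example, the matrices `P_A` and `P_B`, and the helper names `reach_within` and `overtakes_on_window` are all assumptions made for this sketch. Each matrix is the transition matrix induced by one hypothetical pure stationary strategy, and the code computes the probability of reaching the target within $t$ steps for each $t$.

```python
def reach_within(P, start, target, horizon):
    """Return [p_1, ..., p_horizon], where p_t is the probability of
    reaching `target` from `start` within t steps under the Markov
    chain with transition matrix P (a list of rows)."""
    n = len(P)
    dist = [0.0] * n
    dist[start] = 1.0
    probs = []
    for _ in range(horizon):
        new = [0.0] * n
        for s in range(n):
            if s == target:
                # Treat the target as absorbing: once reached, stay reached.
                new[target] += dist[s]
            else:
                for s2 in range(n):
                    new[s2] += dist[s] * P[s][s2]
        dist = new
        probs.append(dist[target])
    return probs


def overtakes_on_window(p, q, from_t):
    """Finite check only: p_t > q_t for every computed t >= from_t
    (t is 1-indexed). This is evidence for, not a proof of, overtaking."""
    return all(a > b for a, b in zip(p[from_t - 1:], q[from_t - 1:]))


# Hypothetical example. States: 0 = start, 1 = target, 2 = absorbing sink.
# Strategy A gambles once: target w.p. 0.5, otherwise stuck in the sink.
P_A = [[0.0, 0.5, 0.5],
       [0.0, 1.0, 0.0],
       [0.0, 0.0, 1.0]]
# Strategy B retries: target w.p. 0.3, otherwise stays in the start state.
P_B = [[0.7, 0.3, 0.0],
       [0.0, 1.0, 0.0],
       [0.0, 0.0, 1.0]]

p_A = reach_within(P_A, start=0, target=1, horizon=20)
p_B = reach_within(P_B, start=0, target=1, horizon=20)

# At t = 1, A is better; from t = 2 on, B is better on the whole window.
print(p_A[0] > p_B[0])
print(overtakes_on_window(p_B, p_A, from_t=2))
```

In this example the reaching probability under A is $0.5$ for every $t \ge 1$, while under B it is $1 - 0.7^t$, which exceeds $0.5$ for every $t \ge 2$. So B overtakes A in the sense of the abstract even though A is better at horizon 1. Note that the numerical check covers only a finite window; the definition requires the inequality for all sufficiently large $t$, which here follows from the closed forms.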
