Finding Safe Zones of policies Markov Decision Processes

Lee Cohen; Yishay Mansour; Michal Moshkovitz

arXiv:2202.11593 · cs.LG · October 10, 2023

TL;DR

This paper introduces SafeZones for a policy of a Markov Decision Process: subsets of states to which most of the policy's trajectories are confined. A good SafeZone has few states and a low escape probability. The paper shows that finding an optimal SafeZone is computationally hard in general and gives a bi-criteria approximation algorithm.

Contribution

The paper formalizes the concept of SafeZones, analyzes the computational complexity of finding them, and proposes a bi-criteria approximation learning algorithm with provable guarantees.

Findings

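01. The quality of a SafeZone is characterized by two parameters: its size (number of states) and its escape probability.

02. Finding optimal SafeZones is computationally hard in general.

03. A bi-criteria learning algorithm with polynomial sample complexity achieves an approximation factor of almost 2 for both the escape probability and the SafeZone size.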

Abstract

Given a policy of a Markov Decision Process, we define a SafeZone as a subset of states, such that most of the policy's trajectories are confined to this subset. The quality of a SafeZone is parameterized by the number of states and the escape probability, i.e., the probability that a random trajectory will leave the subset. SafeZones are especially interesting when they have a small number of states and low escape probability. We study the complexity of finding optimal SafeZones, and show that in general, the problem is computationally hard. Our main result is a bi-criteria approximation learning algorithm with a factor of almost $2$ approximation for both the escape probability and SafeZone size, using a polynomial size sample complexity.
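
To make the two quality parameters concrete, here is a minimal sketch of how the escape probability of a candidate subset could be estimated from sampled trajectories. It is not the paper's algorithm: it only evaluates a given subset. The function name `escape_probability`, the state labels, and the toy trajectories are hypothetical and chosen for illustration.

```python
from typing import Hashable, Sequence, Set


def escape_probability(
    trajectories: Sequence[Sequence[Hashable]],
    zone: Set[Hashable],
) -> float:
    """Empirical escape probability of `zone` (hypothetical helper).

    Returns the fraction of sampled trajectories that visit at least
    one state outside `zone`.
    """
    if not trajectories:
        raise ValueError("need at least one trajectory")
    escaped = sum(
        1 for traj in trajectories if any(state not in zone for state in traj)
    )
    return escaped / len(trajectories)


# Toy sample: four trajectories over made-up states "a".."d".
trajectories = [
    ["a", "b", "a"],
    ["a", "b", "c"],  # leaves {"a", "b"} via "c"
    ["a", "a", "b"],
    ["a", "d", "b"],  # leaves {"a", "b"} via "d"
]

zone = {"a", "b"}
print(len(zone), escape_probability(trajectories, zone))  # prints: 2 0.5
```

In this toy sample, adding "c" to the subset lowers the empirical escape probability to 0.25 but raises the size to 3, which illustrates the trade-off between the two parameters described in the abstract.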

Taxonomy

Topics: Machine Learning and Algorithms · Reinforcement Learning in Robotics