# Efficient and Safe Exploration in Deterministic Markov Decision Processes with Unknown Transition Models

**Authors:** Erdem Bıyık, Jonathan Margoliash, Shahrouz Ryan Alimo, Dorsa Sadigh

arXiv: 1904.01068 · 2020-06-05

## TL;DR

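This paper introduces a safe exploration algorithm for deterministic Markov Decision Processes with unknown transition models. It uses Lipschitz continuity to give a deterministic guarantee that no unsafe state is visited, is optimized to reduce the number of actions needed to explore the safe space, and is evaluated against baselines on simulated navigation tasks.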

## Contribution

The paper presents a safe exploration method for deterministic MDPs with unknown transition models. Its safety guarantee is deterministic, and the method is optimized to reduce the number of actions needed to explore the safe space.

## Key findings

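- The algorithm uses Lipschitz continuity to guarantee that no unsafe states are visited during exploration.
- It is optimized to reduce the number of actions needed to explore the safe space.
- Its performance is compared with baseline methods in simulation on navigation tasks.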

## Abstract

We propose a safe exploration algorithm for deterministic Markov Decision Processes with unknown transition models. Our algorithm guarantees safety by leveraging Lipschitz-continuity to ensure that no unsafe states are visited during exploration. Unlike many other existing techniques, the provided safety guarantee is deterministic. Our algorithm is optimized to reduce the number of actions needed for exploring the safe space. We demonstrate the performance of our algorithm in comparison with baseline methods in simulation on navigation tasks.
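To make the role of Lipschitz continuity concrete, here is a minimal sketch of one way such a bound can certify an untried state-action pair. It is not the paper's algorithm, which this summary does not reproduce. The function names, the Euclidean metric, the finite `candidate_states` grid, and the assumption that the transition function is `lipschitz`-Lipschitz in the joint state-action vector are all hypothetical choices made for illustration.

```python
import math


def next_state_radius(query, observed, lipschitz):
    """Bound on how far the query's next state can lie from the observed next state.

    Assumes (hypothetically) that the transition function f satisfies
    ||f(x) - f(y)|| <= lipschitz * ||x - y|| for state-action vectors x, y.
    """
    return lipschitz * math.dist(query, observed)


def is_certified_safe(query, observations, safe_states, candidate_states, lipschitz):
    """Return True if some past observation proves the query cannot reach an unsafe state.

    query:            state-action vector that has not been tried yet
    observations:     list of (state_action, next_state) pairs already experienced
    safe_states:      set of states known to be safe
    candidate_states: finite set of all states (a discretization, assumed here)
    """
    for state_action, next_state in observations:
        radius = next_state_radius(query, state_action, lipschitz)
        # Every state inside this ball is a possible outcome of the query.
        possible = [s for s in candidate_states if math.dist(s, next_state) <= radius]
        if possible and all(s in safe_states for s in possible):
            return True
    return False


# Toy 1-D line world: states 0..9, of which only 0..5 are known to be safe.
candidate_states = [(float(x),) for x in range(10)]
safe_states = {(float(x),) for x in range(6)}
# One experienced transition: in state 2, action 1 led to state 3.
observations = [((2.0, 1.0), (3.0,))]

near = is_certified_safe((3.0, 1.0), observations, safe_states, candidate_states, 1.0)
far = is_certified_safe((5.0, 1.0), observations, safe_states, candidate_states, 1.0)
```

In this toy setup, the query close to the observed pair is certified because every state it could possibly reach is known to be safe, while the farther query is not, because its ball of possible outcomes includes a state outside the safe set. The check is deterministic in the sense that it never relies on a probability estimate, only on the assumed Lipschitz bound.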

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.01068/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/1904.01068/full.md

## References

25 references — full list in the complete paper: https://tomesphere.com/paper/1904.01068/full.md

---
Source: https://tomesphere.com/paper/1904.01068