Approximating Euclidean by Imprecise Markov Decision Processes

Manfred Jaeger, Giorgio Bacci, Giovanni Bacci, Kim Guldstrand Larsen, and Peter Gjøl Jensen

arXiv:2006.14923·cs.AI·June 29, 2020

TL;DR

This paper explores how finite-state imprecise Markov decision processes can approximate Euclidean MDPs, with guarantees on approximation accuracy, and how such approximations help validate strategies obtained by reinforcement learning.

Contribution

The paper establishes approximation guarantees for finite-state approximations of Euclidean MDPs and uses these approximations to analyse and validate cost functions and strategies obtained by reinforcement learning.

Findings

01

For cost functions over finite time horizons, finite-state approximations become arbitrarily precise as the partition of the continuous state space is refined.

02

Theoretical results validate certain reinforcement learning design choices.

03

Imprecise MDPs reveal inaccuracies in learned cost functions.

Abstract

Euclidean Markov decision processes are a powerful tool for modeling control problems under uncertainty over continuous domains. Finite-state imprecise Markov decision processes can be used to approximate the behavior of these infinite models. In this paper we address two questions: first, we investigate what kind of approximation guarantees are obtained when the Euclidean process is approximated by finite-state approximations induced by increasingly fine partitions of the continuous state space. We show that for cost functions over finite time horizons the approximations become arbitrarily precise. Second, we use imprecise Markov decision process approximations as a tool to analyse and validate cost functions and strategies obtained by reinforcement learning. We find that, on the one hand, our new theoretical results validate basic design choices of a previously proposed reinforcement…
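To make the partition idea concrete, the following is a minimal sketch, not the paper's construction. It assumes a hypothetical one-dimensional process on [0, 1] with two actions and two equally likely disturbances; the dynamics `step`, the cost `cost`, and all constants are invented for illustration. Each cell of a uniform partition is treated as one abstract state, and because a cell only determines a set of possible successor cells per disturbance, the finite-horizon optimal cost is bracketed by a lower and an upper bound per cell.

```python
import math

ACTIONS = (0, 1)      # hypothetical: 0 = drift upward, 1 = pay to stay low
NOISE = (-0.1, 0.1)   # hypothetical disturbances, each with probability 0.5

def step(x, a, w):
    """Toy dynamics on [0, 1]; monotone in x, so a cell maps to an interval."""
    shift = 0.25 if a == 0 else 0.0
    return min(1.0, max(0.0, 0.5 * x + shift + w))

def cost(x, a):
    """Toy one-step cost; increasing in x, so extremes sit at cell endpoints."""
    return x + (0.0 if a == 0 else 0.1)

def cells_hit(lo, hi, n, eps=1e-12):
    """Cells of the n-cell uniform partition that [lo, hi] may touch.

    The interval is widened by eps so that rounding can only add cells,
    which keeps the bounds sound.
    """
    first = min(n - 1, max(0, math.floor((lo - eps) * n)))
    last = min(n - 1, max(0, math.floor((hi + eps) * n)))
    return range(first, last + 1)

def cost_bounds(n, horizon):
    """Per-cell lower and upper bounds on the optimal expected total cost."""
    lower = [0.0] * n
    upper = [0.0] * n
    for _ in range(horizon):
        new_lower, new_upper = [], []
        for i in range(n):
            left, right = i / n, (i + 1) / n
            best_lo = best_hi = math.inf
            for a in ACTIONS:
                q_lo, q_hi = cost(left, a), cost(right, a)
                for w in NOISE:
                    succ = cells_hit(step(left, a, w), step(right, a, w), n)
                    # Imprecision: the successor cell is only known up to a set.
                    q_lo += 0.5 * min(lower[j] for j in succ)
                    q_hi += 0.5 * max(upper[j] for j in succ)
                best_lo = min(best_lo, q_lo)  # cost is minimised over actions
                best_hi = min(best_hi, q_hi)
            new_lower.append(best_lo)
            new_upper.append(best_hi)
        lower, upper = new_lower, new_upper
    return lower, upper

def max_gap(n, horizon=3):
    """Largest distance between upper and lower bound over all cells."""
    lower, upper = cost_bounds(n, horizon)
    return max(u - l for l, u in zip(lower, upper))
```

Comparing `max_gap(20)` with `max_gap(400)` for this toy model shows the bracket tightening as the partition is refined, which is the qualitative behaviour the finite-horizon result describes. The set-valued successor rule used here is a deliberately coarse form of imprecision; a fuller treatment would bound transition probabilities between cells.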
