Regular Decision Processes for Grid Worlds

Nicky Lenaers; Martijn van Otterlo

arXiv:2111.03647·cs.AI·November 10, 2021

TL;DR

This paper experimentally investigates regular decision processes, which support non-Markovian rewards and transitions. It provides a tool chain, algorithmic extensions, and grid world experiments for decision making under temporal dependencies.

Contribution

The paper introduces a tool chain for regular decision processes and algorithmic extensions for online, incremental learning, and evaluates them in non-Markovian grid world environments.

Findings

01. Model-free and model-based algorithms evaluated
02. Successful application to non-Markovian grid worlds
03. Enhanced decision-making under complex temporal dependencies

Abstract

Markov decision processes are typically used for sequential decision making under uncertainty. For many aspects, however, ranging from constrained or safe specifications to various kinds of temporal (non-Markovian) dependencies in task and reward structures, extensions are needed. To that end, interest has grown in recent years in combinations of reinforcement learning and temporal logic, that is, combinations of flexible behavior learning methods with robust verification and guarantees. In this paper we describe an experimental investigation of the recently introduced regular decision processes, which support both non-Markovian reward functions and non-Markovian transition functions. In particular, we provide a tool chain for regular decision processes, algorithmic extensions relating to online, incremental learning, an empirical evaluation of model-free and model-based solution algorithms,…
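To make the idea of a non-Markovian reward concrete, here is a minimal sketch that is not taken from the paper. It assumes a hypothetical 3x3 grid in which reaching a goal cell is rewarded only if a key cell was visited earlier. The grid size, the cell positions, and every function name are illustrative assumptions. A two-state automaton summarizes the relevant part of the history, so the pair (position, automaton state) is Markovian again.

```python
from typing import List, Tuple

Pos = Tuple[int, int]

# Hypothetical 3x3 grid; KEY and GOAL positions are illustrative assumptions.
SIZE = 3
KEY: Pos = (0, 2)
GOAL: Pos = (2, 2)
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def move(pos: Pos, action: str) -> Pos:
    """Deterministic grid transition; a move off the grid leaves pos unchanged."""
    dr, dc = MOVES[action]
    r, c = pos[0] + dr, pos[1] + dc
    return (r, c) if 0 <= r < SIZE and 0 <= c < SIZE else pos


def automaton_step(q: int, pos: Pos) -> int:
    """Two-state automaton over the history: q becomes 1 once KEY is visited."""
    return 1 if q == 1 or pos == KEY else 0


def step(pos: Pos, q: int, action: str) -> Tuple[Pos, int, float]:
    """One step in the product of grid position and automaton state."""
    new_pos = move(pos, action)
    new_q = automaton_step(q, new_pos)
    # The reward depends on the history (via q), not on the position alone.
    reward = 1.0 if new_pos == GOAL and new_q == 1 else 0.0
    return new_pos, new_q, reward


def run(actions: List[str], start: Pos = (0, 0)) -> float:
    """Return the total reward collected along a sequence of actions."""
    pos, q, total = start, 0, 0.0
    for action in actions:
        pos, q, reward = step(pos, q, action)
        total += reward
    return total


via_key = run(["right", "right", "down", "down"])   # passes through KEY
direct = run(["down", "down", "right", "right"])    # never visits KEY
print(via_key, direct)
```

Both action sequences end in the same cell but collect different rewards, so the reward is not a function of the current position alone. Tracking the automaton state alongside the position is one common way to recover a Markovian state in such a setting.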
