Reward Machines: Exploiting Reward Function Structure in Reinforcement Learning

Rodrigo Toro Icarte; Toryn Q. Klassen; Richard Valenzano; Sheila A. McIlraith

arXiv:2010.03950·cs.LG·January 19, 2022

TL;DR

This paper introduces reward machines, a structured way to represent and exploit reward functions in reinforcement learning, leading to more sample-efficient learning and better policies by leveraging internal reward structure.

Contribution

The paper proposes reward machines, a novel finite state machine framework for explicitly representing reward functions, enabling structured learning and reward shaping in RL.

Findings

01 Improved sample efficiency across multiple domains

02 Enhanced policy quality through reward structure exploitation

03 Supports complex reward specifications like temporal logic

Abstract

Reinforcement learning (RL) methods usually treat reward functions as black boxes. As such, these methods must extensively interact with the environment in order to discover rewards and optimal policies. In most RL applications, however, users have to program the reward function and, hence, there is the opportunity to make the reward function visible -- to show the reward function's code to the RL agent so it can exploit the function's internal structure to learn optimal policies in a more sample efficient manner. In this paper, we show how to accomplish this idea in two steps. First, we propose reward machines, a type of finite state machine that supports the specification of reward functions while exposing reward function structure. We then describe different methodologies to exploit this structure to support learning, including automated reward shaping, task decomposition, and…
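To make the idea of a reward function expressed as a finite state machine more concrete, here is a minimal sketch in Python. It is not the authors' implementation: the class name `RewardMachine`, the `(state, event)` transition table, the string event labels, and the "coffee then office" task are all assumptions chosen for illustration. The sketch only shows how a machine state can track task progress so that the reward depends on that state as well as on the latest event.

```python
from typing import Dict, Hashable, Tuple


class RewardMachine:
    """Hypothetical finite-state reward machine (illustrative sketch only).

    Maps (machine state, observed event) to (next machine state, reward).
    """

    def __init__(
        self,
        initial_state: str,
        transitions: Dict[Tuple[str, Hashable], Tuple[str, float]],
    ) -> None:
        self.initial_state = initial_state
        self.transitions = transitions
        self.state = initial_state

    def reset(self) -> None:
        """Return to the initial machine state at the start of an episode."""
        self.state = self.initial_state

    def step(self, event: Hashable) -> float:
        """Advance on an event and return the reward for that transition.

        Assumption: an event with no matching entry leaves the state
        unchanged and yields zero reward.
        """
        next_state, reward = self.transitions.get(
            (self.state, event), (self.state, 0.0)
        )
        self.state = next_state
        return reward


# Illustrative task (an assumption for this sketch): get coffee, then reach the office.
rm = RewardMachine(
    initial_state="u0",
    transitions={
        ("u0", "coffee"): ("u1", 0.0),    # picked up coffee, no reward yet
        ("u1", "office"): ("done", 1.0),  # delivered coffee, task reward
    },
)

# Reaching the office before getting coffee earns nothing; the order matters.
rewards = [rm.step(event) for event in ["office", "coffee", "office"]]
print(rewards, rm.state)
```

Because the transition table is plain data, a learning agent could inspect it directly instead of discovering the reward only through interaction, which is the kind of visibility the abstract argues for.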
