Hierarchies of Reward Machines
Daniel Furelos-Blanco, Mark Law, Anders Jonsson, Krysia Broda, Alessandra Russo

TL;DR
This paper introduces Hierarchies of Reward Machines (HRMs), a hierarchical extension of reward machines that improves learning efficiency in reinforcement learning by decomposing complex tasks into subtasks and enabling curriculum learning.
Contribution
The paper formalizes HRMs by allowing reward machines to call other RMs, and presents a curriculum-based method for learning HRMs from observed traces.
Findings
Exploiting a handcrafted HRM leads to faster convergence than exploiting a flat HRM.
Learning an HRM is feasible in cases where learning its equivalent flat representation is not.
Exploiting HRMs improves task decomposition and learning efficiency.
Abstract
Reward machines (RMs) are a recent formalism for representing the reward function of a reinforcement learning task through a finite-state machine whose edges encode subgoals of the task using high-level events. The structure of RMs enables the decomposition of a task into simpler and independently solvable subtasks that help tackle long-horizon and/or sparse reward tasks. We propose a formalism for further abstracting the subtask structure by endowing an RM with the ability to call other RMs, thus composing a hierarchy of RMs (HRM). We exploit HRMs by treating each call to an RM as an independently solvable subtask using the options framework, and describe a curriculum-based method to learn HRMs from traces observed by the agent. Our experiments reveal that exploiting a handcrafted HRM leads to faster convergence than with a flat HRM, and that learning an HRM is feasible in cases where…
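To make the idea of one RM calling another concrete, here is a minimal Python sketch. It is an illustration under simplifying assumptions, not the paper's formalism: edges are labeled either with a single high-level event or with a call to another RM, the hierarchy is assumed to be acyclic, and rewards and option policies are omitted. The names `RM`, `Call`, `advance`, and `run`, as well as the task and event names (`book`, `paper`, `leather`, `sugarcane`, `rabbit`, `workbench`), are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    """Edge label meaning 'run the named RM until it accepts' (hypothetical)."""
    rm: str


@dataclass
class RM:
    initial: str
    accepting: str
    edges: dict  # state -> list of (label, next_state); label is an event str or a Call


def _pop_finished(hierarchy, stack):
    # Return from every RM on top of the stack that reached its accepting state.
    while stack and stack[-1][1] == hierarchy[stack[-1][0]].accepting:
        stack.pop()


def advance(hierarchy, stack, event):
    """Try to consume one event; return True if some edge fired."""
    name, state = stack[-1]
    for label, nxt in hierarchy[name].edges.get(state, []):
        if isinstance(label, Call):
            # Enter the callee only if it can consume the event right away.
            stack[-1] = (name, nxt)  # state where the caller resumes on return
            stack.append((label.rm, hierarchy[label.rm].initial))
            if advance(hierarchy, stack, event):
                return True
            stack.pop()
            stack[-1] = (name, state)  # undo the tentative call
        elif label == event:
            stack[-1] = (name, nxt)
            _pop_finished(hierarchy, stack)
            return True
    return False  # no edge matched: the event is ignored


def run(hierarchy, root, trace):
    """Return True if the trace drives the root RM to its accepting state."""
    stack = [(root, hierarchy[root].initial)]
    for event in trace:
        advance(hierarchy, stack, event)
        if not stack:
            return True
    return False


# Hypothetical task: "book" calls "paper", then "leather", then needs a workbench.
hierarchy = {
    "book": RM("u0", "uA", {
        "u0": [(Call("paper"), "u1")],
        "u1": [(Call("leather"), "u2")],
        "u2": [("workbench", "uA")],
    }),
    "paper": RM("u0", "uA", {"u0": [("sugarcane", "u1")], "u1": [("workbench", "uA")]}),
    "leather": RM("u0", "uA", {"u0": [("rabbit", "u1")], "u1": [("workbench", "uA")]}),
}

full_trace = ["sugarcane", "rabbit", "workbench", "rabbit", "workbench", "workbench"]
print(run(hierarchy, "book", full_trace))  # True
```

Each stack frame in this sketch corresponds to one active call, which is the unit the abstract proposes to treat as an independently solvable subtask. The two small RMs `paper` and `leather` are defined once and reused by `book`, whereas a flat machine would have to inline both.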
Taxonomy
Topics: Reinforcement Learning in Robotics · Machine Learning and Algorithms · Receptor Mechanisms and Signaling
