Reinforcement Learning with Symbolic Reward Machines
Thomas Krug, Daniel Neider

TL;DR
This paper introduces Symbolic Reward Machines (SRMs), which let reinforcement learning agents work from environment observations alone and so remove the need for manually written labeling functions. In the authors' evaluation, SRMs outperform baseline RL approaches and yield interpretable task representations.
Contribution
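The paper proposes SRMs together with the learning algorithms QSRM and LSRM. Because SRMs consume only the standard output of the environment, they remove the manual labeling step that Reward Machines require, which makes the approach easier to apply in widely adopted RL frameworks and keeps the learned task representation interpretable.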
Findings
SRMs outperform baseline RL methods in experiments.
SRMs match the results of existing Reward Machines.
SRMs provide interpretable task representations.
Abstract
Reward Machines (RMs) are an established mechanism in Reinforcement Learning (RL) to represent and learn sparse, temporally extended tasks with non-Markovian rewards. RMs rely on high-level information in the form of labels that are emitted by the environment alongside the observation. However, this concept requires manual user input for each environment and task. The user has to create a suitable labeling function that computes the labels. These limitations lead to poor applicability in widely adopted RL frameworks. We propose Symbolic Reward Machines (SRMs) together with the learning algorithms QSRM and LSRM to overcome the limitations of RMs. SRMs consume only the standard output of the environment and process the observation directly through guards that are represented by symbolic formulas. In our evaluation, our SRM methods outperform the baseline RL approaches and generate the…
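To make the idea of guards over raw observations concrete, the following is a minimal sketch of a finite-state machine whose transitions fire when a predicate over the observation holds. It is an illustration under stated assumptions, not the authors' SRM definition and not the QSRM or LSRM algorithms: the class name `SymbolicRewardMachineSketch`, the `Transition` record, the state names, and the toy two-dimensional task are all hypothetical, and plain Python predicates stand in for symbolic formulas.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Observation = Sequence[float]
# A guard stands in for a symbolic formula evaluated directly on the observation.
Guard = Callable[[Observation], bool]


@dataclass(frozen=True)
class Transition:
    guard: Guard     # condition on the raw observation
    target: str      # machine state entered when the guard holds
    reward: float    # reward emitted on taking this transition


class SymbolicRewardMachineSketch:
    """Hypothetical illustration: machine states plus observation-guarded edges."""

    def __init__(self, initial: str, transitions: Dict[str, List[Transition]]):
        self.initial = initial
        self.transitions = transitions
        self.state = initial

    def reset(self) -> None:
        self.state = self.initial

    def step(self, obs: Observation) -> float:
        # Take the first outgoing transition whose guard holds;
        # otherwise stay in the current state and emit zero reward.
        for t in self.transitions.get(self.state, []):
            if t.guard(obs):
                self.state = t.target
                return t.reward
        return 0.0


# Assumed toy task: the observation is (x, y); first reach x >= 5, then y >= 5.
srm = SymbolicRewardMachineSketch(
    initial="u0",
    transitions={
        "u0": [Transition(lambda o: o[0] >= 5.0, "u1", 0.0)],
        "u1": [Transition(lambda o: o[1] >= 5.0, "u_done", 1.0)],
    },
)

trajectory = [(0.0, 6.0), (5.0, 0.0), (5.0, 6.0)]
rewards = [srm.step(obs) for obs in trajectory]
print(rewards)  # [0.0, 0.0, 1.0]
```

The first observation already satisfies y >= 5 but earns nothing, because the machine has not yet seen x >= 5. The reward depends on the history tracked by the machine state, which is the non-Markovian behaviour the abstract describes, and no separate labeling function is involved.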
Taxonomy
Topics: Reinforcement Learning in Robotics · Emotion and Mood Recognition · Machine Learning and Algorithms
