Pushdown Reward Machines for Reinforcement Learning
Giovanni Varricchione, Toryn Q. Klassen, Natasha Alechina, Mehdi Dastani, Brian Logan, Sheila A. McIlraith

TL;DR
This paper introduces pushdown reward machines (pdRMs), an extension of reward machines based on deterministic pushdown automata. pdRMs can encode behaviors representable in deterministic context-free languages, which go beyond the regular languages that reward machines can express, and the paper supports them with theoretical analysis and experiments.
Contribution
The work extends reward machines from finite automata to deterministic pushdown automata, which increases the range of behaviors that can be rewarded in reinforcement learning, and provides theoretical analysis and practical algorithms.
Findings
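pdRMs are more expressive than reward machines: they can reward behaviors representable in deterministic context-free languages, whereas reward machines are limited to regular languages.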
The paper gives a procedure for checking when policies that see only the top k stack symbols (for a constant k) are equivalent to policies that see the entire stack.
Experimental results demonstrate successful training on context-free language tasks.
Abstract
Reward machines (RMs) are automata structures that encode (non-Markovian) reward functions for reinforcement learning (RL). RMs can reward any behaviour representable in regular languages and, when paired with RL algorithms that exploit RM structure, have been shown to significantly improve sample efficiency in many domains. In this work, we present pushdown reward machines (pdRMs), an extension of reward machines based on deterministic pushdown automata. pdRMs can recognise and reward temporally extended behaviours representable in deterministic context-free languages, making them more expressive than reward machines. We introduce two variants of pdRM-based policies, one which has access to the entire stack of the pdRM, and one which can only access the top k symbols (for a given constant k) of the stack. We propose a procedure to check when the two kinds of policies (for a given…
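To make the idea of a reward machine with a stack concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's formal definition: the class `PushdownRewardMachine`, the helper `make_counting_pdrm`, the labels `"a"` (collect an item) and `"b"` (deliver an item), and the stack symbols `Z`, `F`, `A` are all hypothetical. The example task rewards the agent once it has delivered exactly as many items as it collected, a behaviour of the form a^n b^n that no finite-state reward machine can track for unbounded n. The `top_k` method sketches the restricted view available to a policy that can only see the top k stack symbols.

```python
from dataclasses import dataclass, field

BOTTOM = "Z"  # bottom-of-stack marker (hypothetical name)


@dataclass
class PushdownRewardMachine:
    # (state, label, top stack symbol) -> (next state, symbols replacing the top, reward)
    transitions: dict
    initial_state: str
    state: str = ""
    stack: list = field(default_factory=list)

    def __post_init__(self):
        self.reset()

    def reset(self):
        self.state = self.initial_state
        self.stack = [BOTTOM]

    def step(self, label):
        """Consume one label observed in the environment and return the reward."""
        top = self.stack[-1] if self.stack else None
        rule = self.transitions.get((self.state, label, top))
        if rule is None:
            return 0.0  # assumption: no matching rule means no move and zero reward
        next_state, replacement, reward = rule
        self.stack.pop()
        self.stack.extend(replacement)  # last element becomes the new top
        self.state = next_state
        return reward

    def top_k(self, k):
        """Top k stack symbols (bottom-most first): the view of a top-k policy."""
        return tuple(self.stack[max(0, len(self.stack) - k):])


def make_counting_pdrm():
    """Hypothetical task: collect n >= 1 items ('a'), then deliver n items ('b')."""
    t = {
        ("collect", "a", BOTTOM): ("collect", (BOTTOM, "F"), 0.0),  # first item
        ("collect", "a", "F"): ("collect", ("F", "A"), 0.0),
        ("collect", "a", "A"): ("collect", ("A", "A"), 0.0),
        ("collect", "b", "A"): ("deliver", (), 0.0),
        ("collect", "b", "F"): ("done", (), 1.0),   # n = 1: single delivery finishes
        ("deliver", "b", "A"): ("deliver", (), 0.0),
        ("deliver", "b", "F"): ("done", (), 1.0),   # last delivery matches first item
    }
    return PushdownRewardMachine(transitions=t, initial_state="collect")


def run(machine, labels):
    """Reset the machine, feed it a sequence of labels, and return the total reward."""
    machine.reset()
    return sum(machine.step(label) for label in labels)


if __name__ == "__main__":
    m = make_counting_pdrm()
    print(run(m, "aaabbb"))        # three collected, three delivered
    print(run(m, "aab"), m.top_k(2))  # one delivery still missing
```

In this sketch a full-stack policy would condition on `(environment state, m.state, tuple(m.stack))`, while a top-k policy would condition on `(environment state, m.state, m.top_k(k))`. Marking the first pushed item with a distinct symbol `F` is one simple way to detect the final delivery while inspecting only the top of the stack.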
Taxonomy
Topics: Machine Learning and Algorithms · Formal Methods in Verification · Reinforcement Learning in Robotics
