On Tackling Complex Tasks with Reward Machines and Signal Temporal Logics
Ana María Gómez Ruiz (UGA), Thao Dang (VERIMAG - IMAG, CNRS, UGA), Alexandre Donzé

TL;DR
This paper introduces a reinforcement learning framework that combines Reward Machines and Signal Temporal Logic to efficiently handle complex tasks and guide training towards satisfying specified behaviors.
Contribution
It extends Reward Machines with STL formulas for better reward representation and uses STL online monitoring to improve RL training for complex tasks.
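To make the idea of STL-driven event generation concrete, here is a minimal sketch of a reward machine whose transitions fire on events derived from the robustness of two simple STL-style formulas. It is not the authors' implementation: the class `RewardMachine`, the helpers `rob_eventually_gt` and `rob_always_lt`, the event names, and the thresholds are all hypothetical, and the formulas are evaluated naively over the whole trace prefix.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


def rob_eventually_gt(signal: List[float], c: float) -> float:
    """Robustness of 'eventually (x > c)' over the prefix: max of (x - c)."""
    return max(x - c for x in signal)


def rob_always_lt(signal: List[float], c: float) -> float:
    """Robustness of 'always (x < c)' over the prefix: min of (c - x)."""
    return min(c - x for x in signal)


@dataclass
class RewardMachine:
    """Finite-state machine; each transition is (event, next_state, reward)."""
    state: str
    transitions: Dict[str, List[Tuple[str, str, float]]]

    def step(self, events: Set[str]) -> float:
        # Take the first listed transition whose event is currently active.
        for event, nxt, reward in self.transitions.get(self.state, []):
            if event in events:
                self.state = nxt
                return reward
        return 0.0  # no active event: stay in place, no reward


def events_from_trace(trace: List[float]) -> Set[str]:
    """Hypothetical event generator: an event fires based on robustness sign."""
    events = set()
    if rob_eventually_gt(trace, 1.0) >= 0:   # goal level reached at some point
        events.add("reached")
    if rob_always_lt(trace, 2.0) < 0:        # safety bound violated
        events.add("unsafe")
    return events


def run_demo(xs: List[float]) -> Tuple[str, float]:
    rm = RewardMachine(
        state="u0",
        transitions={"u0": [("unsafe", "fail", -1.0), ("reached", "done", 1.0)]},
    )
    trace: List[float] = []
    total = 0.0
    for x in xs:
        trace.append(x)
        total += rm.step(events_from_trace(trace))
    return rm.state, total
```

Listing the `unsafe` transition first gives safety violations priority when both events are active at the same step; that ordering is a design choice of this sketch, not something stated in the paper summary.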
Findings
The framework is applied to three environments: minigrid, cart-pole, and highway.
STL-guided reward shaping improves training efficiency.
Case studies demonstrate handling of non-trivial tasks.
Abstract
We propose a Reinforcement Learning (RL) based control design framework for handling complex tasks. The approach extends the concept of Reward Machines (RM) with Signal Temporal Logic (STL) formulas that can be used for event generation. The use of STL not only allows a more efficient representation of rewards for complex tasks but also guides the training process toward behaviors that satisfy the specified requirements. We also propose an implementation of the framework that leverages STL online monitoring algorithms. We illustrate the framework with three case studies (minigrid, cart-pole, and highway environments) involving non-trivial tasks.
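As a rough illustration of how online monitoring could guide training, the sketch below keeps a running robustness value for the formula "eventually (x > c)" and turns its step-to-step change into a shaping bonus. This is a deliberate simplification under stated assumptions: the summary does not describe the paper's monitoring algorithms or its shaping rule, and `EventuallyMonitor`, `shaped_rewards`, `weight`, and `base_reward` are hypothetical names.

```python
from typing import List, Optional


class EventuallyMonitor:
    """Incremental robustness of 'eventually (x > c)' over the prefix seen so far.

    Each update costs O(1): the prefix robustness is a running maximum.
    """

    def __init__(self, c: float) -> None:
        self.c = c
        self.rho = float("-inf")

    def update(self, x: float) -> float:
        self.rho = max(self.rho, x - self.c)
        return self.rho


def shaped_rewards(
    xs: List[float], c: float, base_reward: float = 0.0, weight: float = 1.0
) -> List[float]:
    """Add a bonus proportional to the increase in robustness at each step.

    Assumed shaping rule (not from the paper): bonus = weight * (rho_t - rho_{t-1}),
    with no bonus on the first step.
    """
    monitor = EventuallyMonitor(c)
    prev: Optional[float] = None
    rewards = []
    for x in xs:
        rho = monitor.update(x)
        bonus = 0.0 if prev is None else weight * (rho - prev)
        rewards.append(base_reward + bonus)
        prev = rho
    return rewards
```

For example, `shaped_rewards([0.0, 0.5, 0.25, 1.5], c=1.0)` rewards only the steps at which the signal gets closer to the threshold than ever before, so the agent receives feedback before the formula is actually satisfied.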
