On Tackling Complex Tasks with Reward Machines and Signal Temporal Logics

Ana María Gómez Ruiz (UGA); Thao Dang (VERIMAG - IMAG; CNRS; UGA); Alexandre Donzé

arXiv:2604.14440·cs.AI·April 17, 2026

TL;DR

This paper introduces a reinforcement learning framework that combines Reward Machines with Signal Temporal Logic (STL) to represent rewards for complex tasks more efficiently and to guide training toward behaviors that satisfy specified requirements.

Contribution

It extends Reward Machines with STL formulas that are used for event generation, giving a more efficient reward representation for complex tasks, and proposes an implementation built on STL online monitoring algorithms.

Findings

01

The framework is illustrated on three case studies: minigrid, cart-pole, and highway environments.

02

STL-guided reward shaping improves training efficiency.

03

Case studies demonstrate handling of non-trivial tasks.

Abstract

We propose a Reinforcement Learning (RL) based control design framework for handling complex tasks. The approach extends the concept of Reward Machines (RM) with Signal Temporal Logic (STL) formulas that can be used for event generation. The use of STL not only allows a more efficient representation of rewards for complex tasks but also guides the training process to converge towards behaviors satisfying specified requirements. We also propose an implementation of the framework that leverages STL online monitoring algorithms. We illustrate the framework with three case studies (minigrid, cart-pole, and highway environments) with non-trivial tasks.
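
To make the idea of STL-driven event generation concrete, here is a minimal sketch of a reward machine whose transitions fire when the robustness of an STL-style formula becomes positive. It is an illustration under stated assumptions, not the authors' implementation: the class `STLRewardMachine`, the helper `eventually`, the event names, the thresholds, and the reward values are all hypothetical, and robustness is recomputed naively over the stored trace rather than with an online monitoring algorithm.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Sample = Sequence[float]               # one observation of the signal
Predicate = Callable[[Sample], float]  # value > 0 means the predicate holds


def eventually(pred: Predicate, trace: List[Sample]) -> float:
    """Robustness of 'eventually pred' over the samples seen so far (max)."""
    return max(pred(x) for x in trace)


class STLRewardMachine:
    """Hypothetical reward machine whose events are STL robustness checks."""

    def __init__(
        self,
        initial: str,
        transitions: Dict[Tuple[str, str], Tuple[str, float]],
        events: Dict[str, Callable[[List[Sample]], float]],
    ) -> None:
        self.state = initial
        self.transitions = transitions  # (state, event) -> (next state, reward)
        self.events = events            # event name -> robustness over a trace
        self.trace: List[Sample] = []

    def step(self, obs: Sample) -> float:
        """Record one observation and return the reward of any fired transition."""
        self.trace.append(obs)
        for name, robustness in self.events.items():
            key = (self.state, name)
            # An event fires when its formula has strictly positive robustness.
            if key in self.transitions and robustness(self.trace) > 0:
                self.state, reward = self.transitions[key]
                self.trace = []  # assumption: monitoring restarts in each state
                return reward
        return 0.0


# Hypothetical task on a 1-D position x: reach x > 5, then come back to x < 1.
events = {
    "reached_goal": lambda tr: eventually(lambda x: x[0] - 5.0, tr),
    "back_home": lambda tr: eventually(lambda x: 1.0 - x[0], tr),
}
transitions = {
    ("u0", "reached_goal"): ("u1", 0.5),
    ("u1", "back_home"): ("done", 1.0),
}

rm = STLRewardMachine("u0", transitions, events)
rewards = [rm.step(obs) for obs in ([0.0], [3.0], [6.0], [4.0], [0.5])]
print(rewards, rm.state)
```

In an RL loop, `step` would be called once per environment step and its return value added to (or used in place of) the environment reward. Because robustness is a real number rather than a Boolean, the same quantity could also serve as a shaping signal, although how the paper does this is not specified in the abstract.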
