Expressive Temporal Specifications for Reward Monitoring

Omar Adalat; Francesco Belardinelli

arXiv:2511.12808·cs.LG·December 30, 2025

Expressive Temporal Specifications for Reward Monitoring

Omar Adalat, Francesco Belardinelli

PDF

Open Access

TL;DR

This paper introduces a method using quantitative Linear Temporal Logic to create reward monitors that provide dense, nuanced feedback during reinforcement learning, improving training efficiency and task performance.

Contribution

It proposes a novel, logic-based framework for synthesizing reward monitors that handle non-Markovian properties and dense reward signals, surpassing Boolean monitors.

Findings

01

Quantitative monitors outperform Boolean monitors in task completion.

02

They reduce convergence time in reinforcement learning.

03

The approach is compatible with various algorithms and environments.

Abstract

Specifying informative and dense reward functions remains a pivotal challenge in Reinforcement Learning, as it directly affects the efficiency of agent training. In this work, we harness the expressive power of quantitative Linear Temporal Logic on finite traces (( $LTL_{f} [F]$ )) to synthesize reward monitors that generate a dense stream of rewards for runtime-observable state trajectories. By providing nuanced feedback during training, these monitors guide agents toward optimal behaviour and help mitigate the well-known issue of sparse rewards under long-horizon decision making, which arises under the Boolean semantics dominating the current literature. Our framework is algorithm-agnostic and only relies on a state labelling function, and naturally accommodates specifying non-Markovian properties. Empirical results show that our quantitative monitors consistently subsume…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsReinforcement Learning in Robotics · Formal Methods in Verification · Advanced Software Engineering Methodologies