Average Reward Reinforcement Learning for Omega-Regular and Mean-Payoff Objectives

Milad Kazemi; Mateo Perez; Fabio Somenzi; Sadegh Soudjani; Ashutosh Trivedi; Alvaro Velasquez

arXiv:2505.15693 · cs.AI · March 23, 2026

TL;DR

This paper introduces a model-free reinforcement learning framework that translates absolute liveness specifications, a subclass of omega-regular languages, into average-reward objectives, making it suitable for continuing tasks without resets.

Contribution

It is the first to translate absolute liveness omega-regular specifications into average-reward objectives for model-free RL in unknown MDPs, supporting continuous interaction.

Findings

01. Outperforms discount-based methods in benchmarks
02. Guarantees convergence in unknown communicating MDPs
03. Supports on-the-fly environment reductions

Abstract

Recent advances in reinforcement learning (RL) have renewed interest in reward design for shaping agent behavior, but manually crafting reward functions is tedious and error-prone. A principled alternative is to specify behavioral requirements in a formal, unambiguous language and automatically compile them into learning objectives. $ω$-regular languages are a natural fit, given their role in formal verification and synthesis. However, most existing $ω$-regular RL approaches operate in an episodic, discounted setting with periodic resets, which is misaligned with $ω$-regular semantics over infinite traces. For continuing tasks, where the agent interacts with the environment over a single uninterrupted lifetime, the average-reward criterion is more appropriate. We focus on absolute liveness specifications, a subclass of $ω$-regular languages that cannot be violated…
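
The contrast between the discounted and average-reward criteria mentioned in the abstract can be illustrated with a small sketch. This is not the paper's algorithm; the two reward streams, the horizon, and the discount factor below are hypothetical values chosen only to show how the two criteria can rank the same behaviors differently.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**t * r_t over a finite reward prefix."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


def average_reward(rewards):
    """Empirical mean reward per step over a finite prefix."""
    return sum(rewards) / len(rewards)


# Hypothetical setup: a long but finite prefix of a continuing task.
horizon = 1000
gamma = 0.9  # assumed discount factor, for illustration only

# Stream A: rewarded for the first 10 steps, then never again.
early = [1.0] * 10 + [0.0] * (horizon - 10)
# Stream B: unrewarded for the first 10 steps, then rewarded on every step.
late = [0.0] * 10 + [1.0] * (horizon - 10)

# Discounting weights early steps most heavily, so stream A scores higher.
prefers_early_discounted = (
    discounted_return(early, gamma) > discounted_return(late, gamma)
)
# The per-step average reflects long-run behavior, so stream B scores higher.
prefers_late_average = average_reward(late) > average_reward(early)

print("discounted prefers early stream:", prefers_early_discounted)
print("average reward prefers late stream:", prefers_late_average)
```

Under these assumed values the discounted criterion favors the stream that stops earning reward after ten steps, while the average-reward criterion favors the stream that earns reward indefinitely, which is the behavior that matters over a single uninterrupted lifetime.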
