Alternating Good-for-MDP Automata

Ernst Moritz Hahn; Mateo Perez; Sven Schewe; Fabio Somenzi; and Ashutosh Trivedi; Dominik Wojtczak

arXiv:2205.03243·cs.FL·May 9, 2022

Alternating Good-for-MDP Automata

Ernst Moritz Hahn, Mateo Perez, Sven Schewe, Fabio Somenzi, and Ashutosh Trivedi, Dominik Wojtczak

PDF

Open Access

TL;DR

This paper introduces alternating good-for-MDP automata, which are more succinct than nondeterministic B"uchi automata and can be efficiently derived from deterministic Streett automata, improving automata translation for reinforcement learning.

Contribution

It extends the good-for-MDP automata concept to alternating automata, enabling more efficient translations from deterministic Streett automata with exponential size reductions.

Findings

01

Alternating good-for-MDP automata are bi-linear in size, similar to nondeterministic GFM automata.

02

They can be exponentially more succinct than minimal nondeterministic B"uchi automata.

03

The approach reduces translation complexity from deterministic Streett automata.

Abstract

When omega-regular objectives were first proposed in model-free reinforcement learning (RL) for controlling MDPs, deterministic Rabin automata were used in an attempt to provide a direct translation from their transitions to scalar values. While these translations failed, it has turned out that it is possible to repair them by using good-for-MDPs (GFM) B\"uchi automata instead. These are nondeterministic B\"uchi automata with a restricted type of nondeterminism, albeit not as restricted as in good-for-games automata. Indeed, deterministic Rabin automata have a pretty straightforward translation to such GFM automata, which is bi-linear in the number of states and pairs. Interestingly, the same cannot be said for deterministic Streett automata: a translation to nondeterministic Rabin or B\"uchi automata comes at an exponential cost, even without requiring the target automaton to be…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

Topicssemigroups and automata theory · Formal Methods in Verification · Model-Driven Software Engineering Techniques

MethodsRepair