LTL-Constrained Steady-State Policy Synthesis

Jan K\v{r}et\'insk\'y

arXiv:2105.14894·cs.AI·June 1, 2021

LTL-Constrained Steady-State Policy Synthesis

Jan K\v{r}et\'insk\'y

PDF

TL;DR

This paper introduces a unified approach for synthesizing policies in Markov decision processes that satisfy LTL specifications with steady-state constraints while maximizing rewards, using automata-based reduction and linear programming.

Contribution

It presents a novel method combining LTL, steady-state constraints, and rewards into a multi-dimensional long-run average reward framework using Limit-Deterministic B"uchi Automata.

Findings

01

Algorithm runs in polynomial time.

02

Extends to general ω-regular properties.

03

Provides a unified solution for multi-type specifications.

Abstract

Decision-making policies for agents are often synthesized with the constraint that a formal specification of behaviour is satisfied. Here we focus on infinite-horizon properties. On the one hand, Linear Temporal Logic (LTL) is a popular example of a formalism for qualitative specifications. On the other hand, Steady-State Policy Synthesis (SSPS) has recently received considerable attention as it provides a more quantitative and more behavioural perspective on specifications, in terms of the frequency with which states are visited. Finally, rewards provide a classic framework for quantitative properties. In this paper, we study Markov decision processes (MDP) with the specification combining all these three types. The derived policy maximizes the reward among all policies ensuring the LTL specification with the given probability and adhering to the steady-state constraints. To this end,…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.