LTL-Constrained Steady-State Policy Synthesis
Jan K\v{r}et\'insk\'y

TL;DR
This paper introduces a unified approach for synthesizing policies in Markov decision processes that satisfy LTL specifications with steady-state constraints while maximizing rewards, using automata-based reduction and linear programming.
Contribution
It presents a novel method combining LTL, steady-state constraints, and rewards into a multi-dimensional long-run average reward framework using Limit-Deterministic B"uchi Automata.
Findings
Algorithm runs in polynomial time.
Extends to general ω-regular properties.
Provides a unified solution for multi-type specifications.
Abstract
Decision-making policies for agents are often synthesized with the constraint that a formal specification of behaviour is satisfied. Here we focus on infinite-horizon properties. On the one hand, Linear Temporal Logic (LTL) is a popular example of a formalism for qualitative specifications. On the other hand, Steady-State Policy Synthesis (SSPS) has recently received considerable attention as it provides a more quantitative and more behavioural perspective on specifications, in terms of the frequency with which states are visited. Finally, rewards provide a classic framework for quantitative properties. In this paper, we study Markov decision processes (MDP) with the specification combining all these three types. The derived policy maximizes the reward among all policies ensuring the LTL specification with the given probability and adhering to the steady-state constraints. To this end,…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
