A Learning Based Approach to Control Synthesis of Markov Decision   Processes for Linear Temporal Logic Specifications

Dorsa Sadigh; Eric S. Kim; Samuel Coogan; S. Shankar Sastry; Sanjit A.; Seshia

arXiv:1409.5486·cs.SY·September 22, 2014

A Learning Based Approach to Control Synthesis of Markov Decision Processes for Linear Temporal Logic Specifications

Dorsa Sadigh, Eric S. Kim, Samuel Coogan, S. Shankar Sastry, Sanjit A., Seshia

PDF

Open Access

TL;DR

This paper introduces a learning-based method for synthesizing control policies for Markov decision processes to satisfy linear temporal logic specifications, even with unknown transition probabilities, ensuring correctness with probability one.

Contribution

It presents a novel approach combining automata theory and learning techniques to synthesize controllers for MDPs with LTL constraints, guaranteeing satisfaction if possible.

Findings

01

Method guarantees satisfaction of LTL properties with probability one if such policies exist.

02

Empirical case study in traffic control demonstrates practical effectiveness.

03

Approach can produce reasonable control strategies even when perfect satisfaction isn't achievable.

Abstract

We propose to synthesize a control policy for a Markov decision process (MDP) such that the resulting traces of the MDP satisfy a linear temporal logic (LTL) property. We construct a product MDP that incorporates a deterministic Rabin automaton generated from the desired LTL property. The reward function of the product MDP is defined from the acceptance condition of the Rabin automaton. This construction allows us to apply techniques from learning theory to the problem of synthesis for LTL specifications even when the transition probabilities are not known a priori. We prove that our method is guaranteed to find a controller that satisfies the LTL property with probability one if such a policy exists, and we suggest empirically with a case study in traffic control that our method produces reasonable control strategies even when the LTL property cannot be satisfied with probability one.

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsFormal Methods in Verification · Advanced Software Engineering Methodologies · Petri Nets in System Modeling