MDP Optimal Control under Temporal Logic Constraints
Xu Chu Ding, Stephen L. Smith, Calin Belta, Daniela Rus

TL;DR
This paper presents a method to automatically generate control policies for MDPs that satisfy complex temporal logic specifications while optimizing a cost criterion, applicable to persistent robotic tasks.
Contribution
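It combines LTL specifications with cost optimization in MDP control synthesis: a new optimization criterion (expected cost between satisfactions of a designated proposition) and a dynamic programming algorithm whose policy is optimal under some conditions and sub-optimal otherwise.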
Findings
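Synthesizes control policies that satisfy an LTL specification almost surely, whenever such a policy exists.
Proposes a sufficient condition for optimality and a dynamic programming algorithm that returns an optimal policy under some conditions and a sub-optimal one otherwise.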
Applicable to persistent robotic tasks like monitoring and data gathering.
Abstract
In this paper, we develop a method to automatically generate a control policy for a dynamical system modeled as a Markov Decision Process (MDP). The control specification is given as a Linear Temporal Logic (LTL) formula over a set of propositions defined on the states of the MDP. We synthesize a control policy such that the MDP satisfies the given specification almost surely, if such a policy exists. In addition, we designate an "optimizing proposition" to be repeatedly satisfied, and we formulate a novel optimization criterion in terms of minimizing the expected cost in between satisfactions of this proposition. We propose a sufficient condition for a policy to be optimal, and develop a dynamic programming algorithm that synthesizes a policy that is optimal under some conditions, and sub-optimal otherwise. This problem is motivated by robotic applications requiring persistent tasks,…
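To make the cost criterion concrete, the following is a minimal sketch of one ingredient only: the expected cost to reach a state where the optimizing proposition holds, computed by standard value iteration on a toy MDP. It is not the paper's algorithm. It ignores the LTL constraint entirely, and the toy MDP, the function name `expected_cost_to_target`, and the data layout are assumptions made for illustration.

```python
def expected_cost_to_target(transitions, costs, targets, tol=1e-10, max_iter=10_000):
    """Value iteration for the minimum expected cost to reach `targets`.

    transitions[s][a] is a list of (next_state, probability) pairs and
    costs[s][a] is the cost of taking action a in state s. Both layouts
    are hypothetical, chosen only for this sketch.
    """
    values = {s: 0.0 for s in transitions}
    policy = {}
    for _ in range(max_iter):
        delta = 0.0
        for s in transitions:
            if s in targets:
                continue  # cost-to-go is zero once the proposition holds
            best_action, best_value = None, float("inf")
            for a, outcomes in transitions[s].items():
                q = costs[s][a] + sum(p * values[t] for t, p in outcomes)
                if q < best_value:
                    best_action, best_value = a, q
            delta = max(delta, abs(best_value - values[s]))
            values[s] = best_value
            policy[s] = best_action
        if delta < tol:
            break
    return values, policy


# Toy MDP (assumed): state 2 is where the optimizing proposition holds.
transitions = {
    0: {"a": [(1, 1.0)], "b": [(2, 0.5), (0, 0.5)]},
    1: {"a": [(2, 1.0)]},
    2: {},
}
costs = {0: {"a": 1.0, "b": 0.8}, 1: {"a": 1.0}, 2: {}}

values, policy = expected_cost_to_target(transitions, costs, targets={2})
print(values, policy)
```

In this toy example the risky action "b" in state 0 is preferred because its expected cost to reach state 2 is lower than the two-step deterministic route. The setting described in the abstract additionally requires that the policy satisfy the LTL specification almost surely, which this sketch does not attempt.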
Taxonomy
Topics: Formal Methods in Verification
