Model-Based Reinforcement Learning for Approximate Optimal Control with Temporal Logic Specifications
Max Cohen, Calin Belta

TL;DR
This paper presents a learning-based approach for synthesizing approximately optimal control policies that make uncertain nonlinear systems satisfy temporal logic specifications, with correctness ensured via barrier certificates and Lyapunov-based tools.
Contribution
It introduces a novel method combining model-based reinforcement learning with temporal logic constraints for continuous-time nonlinear systems.
Findings
Successfully synthesizes control policies satisfying scLTL specifications.
Provides conditions ensuring correctness of approximate solutions.
Demonstrates effectiveness through a numerical example.
Abstract
In this paper we study the problem of synthesizing optimal control policies for uncertain continuous-time nonlinear systems from syntactically co-safe linear temporal logic (scLTL) formulas. We formulate this problem as a sequence of reach-avoid optimal control sub-problems. We show that the resulting hybrid optimal control policy guarantees the satisfaction of a given scLTL formula by constructing a barrier certificate. Since solving each optimal control problem may be computationally intractable, we take a learning-based approach to approximately solve this sequence of optimal control problems online without requiring full knowledge of the system dynamics. Using Lyapunov-based tools, we develop sufficient conditions under which our approximate solution maintains correctness. Finally, we demonstrate the efficacy of the developed method with a numerical example.
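To make the "sequence of reach-avoid sub-problems" idea concrete, here is a minimal sketch of one common way such a sequence can be read off an automaton for an scLTL formula. It is an illustration under stated assumptions, not necessarily the authors' construction: the formula `(!O) U (A & ((!O) U B))`, the hand-written automaton, the region names `A`, `B`, `O`, and the helper `reach_avoid_sequence` are all hypothetical.

```python
from collections import deque

# Hypothetical DFA for the illustrative formula (!O) U (A & ((!O) U B)):
# "reach A, then reach B, never entering O before the task is complete".
# TRANSITIONS[state][symbol] -> next state; "trap" is a rejecting sink.
TRANSITIONS = {
    "q0": {"A": "q1", "O": "trap"},
    "q1": {"B": "q2", "O": "trap"},
}
INITIAL, ACCEPTING, TRAP = "q0", {"q2"}, "trap"


def reach_avoid_sequence(transitions, initial, accepting, trap):
    """Breadth-first search for a shortest accepting run.

    Returns one (reach, avoid) pair per edge of the run, where `reach` is the
    region that triggers the edge and `avoid` lists the regions that would
    send the automaton to the trap state from the edge's source state.
    Returns None if no accepting state is reachable.
    """
    queue = deque([(initial, [])])
    visited = {initial}
    while queue:
        state, plan = queue.popleft()
        if state in accepting:
            return plan
        edges = transitions.get(state, {})
        avoid = sorted(sym for sym, nxt in edges.items() if nxt == trap)
        for sym, nxt in edges.items():
            if nxt == trap or nxt in visited:
                continue
            visited.add(nxt)
            queue.append((nxt, plan + [(sym, avoid)]))
    return None


plan = reach_avoid_sequence(TRANSITIONS, INITIAL, ACCEPTING, TRAP)
for step, (reach, avoid) in enumerate(plan, start=1):
    print(f"sub-problem {step}: reach {reach}, avoid {avoid}")
```

Each pair in `plan` would then be handed to a reach-avoid optimal control solver. In the setting of the abstract, that solver is the online learning-based approximation, which this sketch does not attempt to reproduce.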
