Model-Based Reinforcement Learning for Approximate Optimal Control with Temporal Logic Specifications

Max Cohen; Calin Belta

arXiv:2101.07156·eess.SY·April 16, 2021

TL;DR

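This paper presents a learning-based approach for synthesizing approximate optimal control policies that drive uncertain continuous-time nonlinear systems to satisfy syntactically co-safe linear temporal logic (scLTL) specifications. Correctness is established with a barrier certificate and preserved under approximation through Lyapunov-based sufficient conditions.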

Contribution

It introduces a novel method combining model-based reinforcement learning with temporal logic constraints for continuous-time nonlinear systems.

Findings

1. Synthesizes approximate optimal control policies that satisfy scLTL specifications.

2. Provides sufficient conditions under which the approximate solution remains correct.

3. Demonstrates the method's efficacy with a numerical example.

Abstract

In this paper we study the problem of synthesizing optimal control policies for uncertain continuous-time nonlinear systems from syntactically co-safe linear temporal logic (scLTL) formulas. We formulate this problem as a sequence of reach-avoid optimal control sub-problems. We show that the resulting hybrid optimal control policy guarantees the satisfaction of a given scLTL formula by constructing a barrier certificate. Since solving each optimal control problem may be computationally intractable, we take a learning-based approach to approximately solve this sequence of optimal control problems online without requiring full knowledge of the system dynamics. Using Lyapunov-based tools, we develop sufficient conditions under which our approximate solution maintains correctness. Finally, we demonstrate the efficacy of the developed method with a numerical example.
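
As a rough illustration of how an scLTL formula can be turned into a sequence of reach-avoid sub-problems, the sketch below searches a small finite automaton for an accepting run and emits one sub-problem per edge of that run. It is not the authors' construction: the automaton, the region labels `a`, `b`, and `obs`, the function names, and the rule "avoid every other outgoing label" are all assumptions made for this example.

```python
from collections import deque

# Hypothetical automaton for a formula of the form
# "avoid obs until a, then avoid obs until b".
# TRANSITIONS[state][region_label] = next_state; self-loops are implicit.
TRANSITIONS = {
    "q0": {"a": "q1", "obs": "trap"},
    "q1": {"b": "q_acc", "obs": "trap"},
    "q_acc": {},
    "trap": {},
}


def accepting_path(transitions, start, accepting):
    """Breadth-first search for a shortest list of (state, label, next_state)
    edges from `start` to any accepting state; returns None if none exists."""
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        state, path = queue.popleft()
        if state in accepting:
            return path
        for label, nxt in transitions[state].items():
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, path + [(state, label, nxt)]))
    return None


def reach_avoid_sequence(transitions, start, accepting):
    """Turn an accepting run into an ordered list of reach-avoid sub-problems.

    Assumption for this sketch: while in a given automaton state, the system
    must reach the region that advances the run and avoid every other region
    that would trigger a different transition out of that state.
    """
    path = accepting_path(transitions, start, accepting)
    if path is None:
        raise ValueError("no accepting run exists")
    subproblems = []
    for state, label, _next_state in path:
        avoid = sorted(other for other in transitions[state] if other != label)
        subproblems.append({"reach": label, "avoid": avoid})
    return subproblems


SUBPROBLEMS = reach_avoid_sequence(TRANSITIONS, "q0", {"q_acc"})
for index, problem in enumerate(SUBPROBLEMS, start=1):
    print(f"sub-problem {index}: reach {problem['reach']}, avoid {problem['avoid']}")
```

In this toy setting each entry of `SUBPROBLEMS` would correspond to one optimal control problem solved in order, with the controller switching to the next entry once the current target region is reached; the learning-based approximation and the barrier-certificate argument described in the abstract are outside the scope of the sketch.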
