An Actor Critic Method for Free Terminal Time Optimal Control

Evan Burton; Tenavi Nakamura-Zimmerer; Qi Gong; Wei Kang

arXiv:2208.00065 · math.OC · August 8, 2022

TL;DR

This paper adapts the actor-critic method from reinforcement learning to solve free terminal time optimal control problems that have nonsmooth or discontinuous control laws, irregular value functions, and many local optima.

Contribution

It proposes an exponential transformation within a model-based actor-critic framework to learn approximate feedback controls and value functions for challenging optimal control problems.

Findings

01 Successfully handles nonsmooth and discontinuous control laws

02 Addresses irregular value functions and local optima

03 Effective on prototypical challenging examples

Abstract

Optimal control problems with free terminal time present many challenges including nonsmooth and discontinuous control laws, irregular value functions, many local optima, and the curse of dimensionality. To overcome these issues, we propose an adaptation of the model-based actor-critic paradigm from the field of Reinforcement Learning via an exponential transformation to learn an approximate feedback control and value function pair. We demonstrate the algorithm's effectiveness on prototypical examples featuring each of the main pathological issues present in problems of this type.
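
The abstract does not state the form of the exponential transformation. As a minimal sketch, assume the standard Kruzhkov-type transform for a minimum-time problem, v(x) = 1 - exp(-T(x)), where T(x) is the time to reach the target. Under that assumption the transformed cost equals the discounted integral of exp(-s) over [0, T], so it stays bounded by 1 even for controls that never reach the target. The toy system below (a 1D single integrator x' = u with |u| <= 1 and target {0}) and the names `transformed_cost` and `time_from_transformed` are hypothetical illustrations, not taken from the paper.

```python
import math


def transformed_cost(x0, policy, dt=1e-3, t_max=50.0, tol=1e-9):
    """Accumulate the integral of exp(-s) along x' = u until |x| <= tol.

    Assumed toy setup (not from the paper): 1D single integrator,
    |u| <= 1, target set {0}. Returns (cost, elapsed_time).
    """
    x, t, cost = x0, 0.0, 0.0
    while abs(x) > tol and t < t_max:
        u = max(-1.0, min(1.0, policy(x)))  # clip to the control bound
        # Shorten the last step so the state lands on the target
        # instead of overshooting it.
        if u != 0.0 and x * u < 0.0:
            step = min(dt, abs(x) / abs(u))
        else:
            step = dt
        # Exact integral of exp(-s) over [t, t + step].
        cost += math.exp(-t) - math.exp(-(t + step))
        x += u * step
        t += step
    return cost, t


def time_from_transformed(v):
    """Invert v = 1 - exp(-T) to recover the terminal time T."""
    return -math.log1p(-v)


if __name__ == "__main__":
    # Bang-bang feedback u = -sign(x) is time-optimal for this toy system.
    v, t = transformed_cost(2.0, lambda x: -math.copysign(1.0, x))
    print(v, t, time_from_transformed(v))
```

For this toy system the minimum time from x0 is |x0|, so the cost under the bang-bang feedback should match 1 - exp(-|x0|) up to discretization error, while a control that never reaches the target yields a cost near 1 rather than an infinite terminal time. That boundedness is one plausible reason to work with a transformed value function in a learning setting.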

Taxonomy

Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control