An Actor Critic Method for Free Terminal Time Optimal Control
Evan Burton, Tenavi Nakamura-Zimmerer, Qi Gong, Wei Kang

TL;DR
This paper adapts the model-based actor-critic method from reinforcement learning to free terminal time optimal control problems, which feature nonsmooth or discontinuous control laws, irregular value functions, and many local optima.
Contribution
The paper applies an exponential transformation within a model-based actor-critic framework to learn an approximate feedback control and value function pair for free terminal time optimal control problems.
Findings
Successfully handles nonsmooth and discontinuous control laws
Addresses irregular value functions and local optima
Demonstrated on prototypical examples that exhibit each of the main pathologies of free terminal time problems
Abstract
Optimal control problems with free terminal time present many challenges including nonsmooth and discontinuous control laws, irregular value functions, many local optima, and the curse of dimensionality. To overcome these issues, we propose an adaptation of the model-based actor-critic paradigm from the field of Reinforcement Learning via an exponential transformation to learn an approximate feedback control and value function pair. We demonstrate the algorithm's effectiveness on prototypical examples featuring each of the main pathological issues present in problems of this type.
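The abstract does not spell out the exponential transformation. As a rough illustration only, the sketch below uses the Kruzhkov transform, v = 1 - exp(-T), a standard device for minimum-time problems that maps an unbounded time-to-go T onto the bounded interval [0, 1] and equals the discounted cost of a unit running cost. Whether the paper uses exactly this form is an assumption, and the toy 1D problem and all function names here are hypothetical.

```python
import math


def kruzhkov(T):
    """Map a time-to-go T in [0, inf] to a bounded value in [0, 1]."""
    return 1.0 - math.exp(-T)


def inverse_kruzhkov(v):
    """Recover the time-to-go from the transformed value; v = 1 maps to infinity."""
    return math.inf if v >= 1.0 else -math.log1p(-v)


def min_time_1d(x):
    """Toy problem (assumption): x' = u with |u| <= 1 and target x = 0, so T(x) = |x|."""
    return abs(x)


def discounted_cost(T, dt=1e-4):
    """Midpoint-rule approximation of the integral of exp(-t) over [0, T]."""
    n = int(round(T / dt))
    return sum(math.exp(-(k + 0.5) * dt) for k in range(n)) * dt


if __name__ == "__main__":
    for x in (0.0, 0.5, 2.0, 10.0):
        T = min_time_1d(x)
        v = kruzhkov(T)
        print(f"x={x:5.1f}  T={T:5.1f}  v={v:.6f}  T recovered={inverse_kruzhkov(v):.6f}")
```

Under this assumed transform, states that cannot reach the target (T infinite) receive the finite value 1, so a critic can be trained against a bounded target, and the free terminal time cost coincides with a discounted infinite-horizon cost of the kind reinforcement learning methods usually handle.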
Taxonomy
Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control
