Offline Supervised Learning V.S. Online Direct Policy Optimization: A Comparative Study and A Unified Training Paradigm for Neural Network-Based Optimal Feedback Control

Yue Zhao; Jiequn Han

arXiv:2211.15930·math.OC·April 10, 2024·1 cites

TL;DR

This paper compares offline supervised learning and online direct policy optimization for neural network-based feedback controllers, highlighting their strengths and weaknesses, and proposes a unified pre-train and fine-tune paradigm to enhance performance and robustness.

Contribution

It provides a comprehensive comparison of two prevalent control training methods and introduces a unified training paradigm to improve neural network-based optimal feedback control.

Findings

1. Offline supervised learning outperforms online direct policy optimization in both optimality and training time.

2. Direct policy optimization is hard to optimize when the dynamics are complex.

3. The proposed unified pre-train and fine-tune paradigm significantly enhances control performance and robustness.

Abstract

This work is concerned with efficiently training neural network-based feedback controllers for optimal control problems. We first conduct a comparative study of two prevalent approaches: offline supervised learning and online direct policy optimization. Although the training part of the supervised learning approach is relatively easy, the success of the method depends heavily on the optimal control dataset generated by open-loop optimal control solvers. In contrast, direct policy optimization turns the optimal control problem directly into an optimization problem without requiring any pre-computation, but the dynamics-related objective can be hard to optimize when the problem is complicated. Our results underscore the superiority of offline supervised learning in terms of both optimality and training time. To overcome the main challenges, dataset and optimization, in the two approaches…
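To make the two approaches in the abstract concrete, here is a minimal sketch on a toy problem. It is not the authors' code and does not use a neural network: it assumes a scalar linear system with a one-parameter policy u = -k*x, and every constant and function name below is hypothetical. The dataset that the paper obtains from open-loop solvers is replaced by noisy samples of the known optimal control, and a finite-difference gradient stands in for backpropagation through the dynamics. The last call mimics the pre-train and fine-tune idea from the TL;DR.

```python
import random

# Toy scalar problem (all values are illustrative assumptions, not from the paper):
# dynamics x_{t+1} = A*x_t + B*u_t, stage cost Q*x^2 + R*u^2, policy u = -k*x.
A, B, Q, R, HORIZON = 0.9, 0.5, 1.0, 1.0, 30
INITIAL_STATES = [-1.0, -0.5, 0.5, 1.0]


def riccati_gain(iterations=200):
    """Reference optimal gain from the scalar discrete-time Riccati recursion."""
    p = Q
    for _ in range(iterations):
        p = Q + A * A * p - (A * B * p) ** 2 / (R + B * B * p)
    return A * B * p / (R + B * B * p)


def closed_loop_cost(k):
    """Average rollout cost of the policy u = -k*x over the initial states."""
    total = 0.0
    for x in INITIAL_STATES:
        for _ in range(HORIZON):
            u = -k * x
            total += Q * x * x + R * u * u
            x = A * x + B * u
    return total / len(INITIAL_STATES)


def supervised_fit(states, controls):
    """Offline supervised learning: least-squares fit of k in u = -k*x."""
    return -sum(x * u for x, u in zip(states, controls)) / sum(x * x for x in states)


def direct_policy_optimization(k, steps=500, lr=0.02, eps=1e-5):
    """Online direct policy optimization: gradient descent on the rollout cost.

    A central finite difference stands in for backpropagation through the dynamics.
    """
    for _ in range(steps):
        grad = (closed_loop_cost(k + eps) - closed_loop_cost(k - eps)) / (2 * eps)
        k -= lr * grad
    return k


k_star = riccati_gain()

# Stand-in for a dataset from an open-loop solver: optimal controls plus small noise.
rng = random.Random(0)
states = [rng.uniform(-1.0, 1.0) for _ in range(20)]
controls = [-k_star * x + rng.gauss(0.0, 0.05) for x in states]

k_supervised = supervised_fit(states, controls)         # offline supervised learning
k_direct = direct_policy_optimization(0.0)              # direct optimization from scratch
k_finetuned = direct_policy_optimization(k_supervised)  # pre-train, then fine-tune

for name, k in [("reference", k_star), ("supervised", k_supervised),
                ("direct", k_direct), ("fine-tuned", k_finetuned)]:
    print(f"{name:>10}: k = {k:.4f}, cost = {closed_loop_cost(k):.4f}")
```

In this toy setting both approaches find nearly the same gain, so the sketch shows only the mechanics: the supervised fit depends on the quality of the dataset, while direct optimization depends on how well-behaved the rollout cost is. It says nothing about the paper's empirical comparison on harder problems.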

Code & Models

Repositories

yzhao98/deepoptimalcontrol (PyTorch, official)

Taxonomy

Topics: Machine Learning and ELM · Machine Learning in Materials Science · Advanced Neural Network Applications