Two-step reinforcement learning for model-free redesign of nonlinear optimal regulator

Mei Minami; Yuka Masumoto; Yoshihiro Okawa; Tomotake Sasaki; Yutaka Hori

arXiv:2103.03808 · eess.SY · December 1, 2023

TL;DR

This paper introduces a two-step model-free reinforcement learning method for redesigning nonlinear optimal regulators, improving learning efficiency and transient performance without requiring system modeling.

Contribution

The paper proposes a two-step approach for nonlinear systems that combines offline design of a linear control law with online RL, and it proves that the offline step converges to the LQR controller.

Findings

01 Improved transient learning performance demonstrated in simulations.

02 Enhanced efficiency in hyperparameter tuning of RL.

03 Theoretical proof of convergence to LQR controller.
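
For readers unfamiliar with the LQR controller named in the findings, the following is a minimal sketch of what that target looks like for a scalar discrete-time system. It is not the paper's algorithm: the paper's method is model-free, whereas this sketch assumes the model is known and simply iterates the Riccati recursion. The system x[k+1] = a*x[k] + b*u[k], the cost weights q and r, and the function names `lqr_gain_scalar` and `closed_loop_cost` are all hypothetical choices made for illustration.

```python
def lqr_gain_scalar(a, b, q, r, iters=500):
    """Model-based LQR for the scalar system x[k+1] = a*x[k] + b*u[k].

    Minimizes sum_k (q*x[k]**2 + r*u[k]**2) by iterating the discrete-time
    Riccati recursion. Returns (k, p) with control law u = -k*x and
    optimal cost p*x0**2. Assumes b != 0, q > 0, r > 0.
    """
    p = q  # initial guess for the Riccati solution
    for _ in range(iters):
        k = (a * b * p) / (r + b * b * p)   # gain for the current p
        p = q + a * a * p - a * b * p * k   # Riccati update
    k = (a * b * p) / (r + b * b * p)
    return k, p


def closed_loop_cost(a, b, q, r, k, x0=1.0, steps=200):
    """Accumulated quadratic cost of the linear law u = -k*x from state x0."""
    x, cost = x0, 0.0
    for _ in range(steps):
        u = -k * x
        cost += q * x * x + r * u * u
        x = a * x + b * u
    return cost


# Illustrative values only (assumed, not taken from the paper):
# an unstable open-loop plant with a = 1.2.
gain, riccati_p = lqr_gain_scalar(1.2, 1.0, 1.0, 1.0)
lqr_cost = closed_loop_cost(1.2, 1.0, 1.0, 1.0, gain)
```

Perturbing `gain` slightly in either direction and recomputing `closed_loop_cost` gives a higher cost, which is one way to check that the returned gain is the optimal linear law for this toy system.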

Abstract

In many practical control applications, the performance of a closed-loop system degrades over time due to changes in plant characteristics. Thus, there is a strong need to redesign the controller without going through the system modeling process, which is often difficult for closed-loop systems. Reinforcement learning (RL) is a promising approach that enables model-free redesign of optimal controllers for nonlinear dynamical systems based only on measurements of the closed-loop system. However, the learning process of RL usually requires a considerable number of trial-and-error experiments on a poorly controlled system, which may accumulate wear on the plant. To overcome this limitation, we propose a model-free two-step design approach that improves the transient learning performance of RL in an optimal regulator redesign problem for unknown nonlinear systems.…

Code & Models

Repositories

hori-group/two-step-design (official)

Taxonomy

Topics: Mechanical Circulatory Support Devices · Adaptive Dynamic Programming Control