Two-step reinforcement learning for model-free redesign of nonlinear optimal regulator
Mei Minami, Yuka Masumoto, Yoshihiro Okawa, Tomotake Sasaki, Yutaka Hori

TL;DR
This paper introduces a two-step model-free reinforcement learning method for redesigning nonlinear optimal regulators, improving learning efficiency and transient performance without requiring system modeling.
Contribution
It proposes a novel two-step approach combining offline linear control design with online RL for nonlinear systems, with theoretical convergence guarantees.
Findings
Improved transient learning performance demonstrated in simulations.
Enhanced efficiency in hyperparameter tuning of RL.
Theoretical proof of convergence to LQR controller.
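For readers unfamiliar with the LQR controller named in the findings, the following is a minimal illustrative sketch, not the paper's method. It computes the LQR feedback gain for a scalar discrete-time linear system x[k+1] = a*x[k] + b*u[k] with stage cost q*x^2 + r*u^2 by iterating the Riccati equation. Unlike the model-free approach summarized here, it assumes the model parameters are known; the function name `scalar_lqr_gain` and the values of `a`, `b`, `q`, and `r` are hypothetical and chosen only for illustration.

```python
def scalar_lqr_gain(a, b, q, r, iters=1000, tol=1e-12):
    """Iterate the scalar discrete-time Riccati equation to a fixed point.

    Model-based illustration only: a and b are assumed known here,
    which is exactly what a model-free method avoids.
    """
    p = q  # initial guess for the cost-to-go coefficient
    for _ in range(iters):
        p_next = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
        if abs(p_next - p) < tol:
            p = p_next
            break
        p = p_next
    # Optimal state feedback u = -k * x
    k = a * b * p / (r + b * b * p)
    return k, p


# Hypothetical open-loop unstable plant (|a| > 1) with unit weights.
a, b, q, r = 1.2, 1.0, 1.0, 1.0
k, p = scalar_lqr_gain(a, b, q, r)
closed_loop_pole = a - b * k  # stable if its magnitude is below 1
print(f"gain k = {k:.4f}, closed-loop pole = {closed_loop_pole:.4f}")
```

A gain of this kind is the reference point for the convergence claim: the summary states that the method provably converges to the LQR controller, and the sketch shows only what that target looks like in the simplest linear case.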
Abstract
In many practical control applications, the performance of a closed-loop system degrades over time due to changes in plant characteristics. Thus, there is a strong need for redesigning a controller without going through the system modeling process, which is often difficult for closed-loop systems. Reinforcement learning (RL) is a promising approach that enables model-free redesign of optimal controllers for nonlinear dynamical systems based only on measurements of the closed-loop system. However, the learning process of RL usually requires a considerable number of trial-and-error experiments on a poorly controlled system, which may accumulate wear on the plant. To overcome this limitation, we propose a model-free two-step design approach that improves the transient learning performance of RL in an optimal regulator redesign problem for unknown nonlinear systems.…
Taxonomy
Topics: Mechanical Circulatory Support Devices · Adaptive Dynamic Programming Control
