Singular Perturbation-based Reinforcement Learning of Two-Point Boundary Optimal Control Systems

Vasanth Reddy; Hoda Eldardiry; Almuatazbellah Boker

arXiv:2104.09652·eess.SY·May 3, 2021

Open Access

TL;DR

This paper introduces a reinforcement learning approach to two-point boundary optimal control of linear time-varying systems with unknown dynamics. It uses singular perturbation theory to split the time-varying problem into time-invariant subproblems, so that the controller gains can be learned with an off-policy iteration method.

Contribution

It combines singular perturbation theory with reinforcement learning to solve two-point boundary optimal control problems for linear time-varying systems with unknown dynamics.

Findings

01. The performance of the learned controller approaches that of the model-based optimal controller as the time horizon increases.

02. The method transforms the time-varying optimal control problem into time-invariant subproblems.

03. A simulation example supports the theoretical approximation result.
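The horizon dependence in the first finding can be illustrated on a simpler stand-in problem. The sketch below is not the paper's two-point boundary formulation: it assumes a standard finite-horizon LQR with free terminal state, a known model, and a hypothetical second-order system (`A`, `B`, `Q`, `R` are made up for illustration). It integrates the Riccati differential equation backward from P(T) = 0 and reports how far P(0) is from satisfying the time-invariant algebraic Riccati equation (ARE).

```python
import numpy as np

# Hypothetical example system and weights (assumptions, not taken from the paper).
A = np.array([[0.0, 1.0], [-1.0, -2.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])


def riccati_rhs(P):
    """Riccati right-hand side in reversed time s = T - t.

    It vanishes exactly when P solves the algebraic Riccati equation,
    so its norm doubles as an ARE residual.
    """
    return A.T @ P + P @ A - P @ B @ np.linalg.solve(R, B.T @ P) + Q


def riccati_at_start(T, dt=0.01):
    """Integrate from the terminal condition P(T) = 0 back to t = 0 (fixed-step RK4)."""
    P = np.zeros_like(A)
    for _ in range(int(round(T / dt))):
        k1 = riccati_rhs(P)
        k2 = riccati_rhs(P + 0.5 * dt * k1)
        k3 = riccati_rhs(P + 0.5 * dt * k2)
        k4 = riccati_rhs(P + dt * k3)
        P = P + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return P


def are_residual(T):
    """Frobenius norm of the ARE residual evaluated at P(0) for horizon T."""
    return np.linalg.norm(riccati_rhs(riccati_at_start(T)))


residuals = {T: are_residual(T) for T in (1.0, 5.0, 10.0)}
for T, r in residuals.items():
    print(f"T = {T:5.1f}   ARE residual at t = 0: {r:.3e}")
```

In this toy setting the residual shrinks as T grows, which is the sense in which a time-invariant gain becomes a better approximation of the time-varying one over long horizons.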

Abstract

This work presents a learning technique in which the learning process is guided by knowledge of the physics of the system. In particular, we use reinforcement learning to solve the two-point boundary optimal control problem for linear time-varying systems with unknown model dynamics. Borrowing techniques from singular perturbation theory, we transform the time-varying optimal control problem into a couple of time-invariant subproblems. This allows an off-policy iteration method to be used to learn the controller gains. We show that the performance of the learning-based controller approximates that of the model-based optimal controller, and that the accuracy of the approximation improves as the time horizon of the control problem increases. Finally, we provide a simulation example to verify the results of the paper.
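To make the idea of iterating on controller gains for a time-invariant subproblem concrete, here is a minimal sketch of Kleinman's model-based policy iteration for continuous-time LQR. This is an illustrative stand-in, not the paper's algorithm: it assumes the model is known, whereas an off-policy learning scheme would estimate the same policy-evaluation quantities from measured trajectories. The system matrices, weights, and the initial gain `K0` are hypothetical.

```python
import numpy as np


def solve_lyapunov(Ac, Qc):
    """Solve Ac^T P + P Ac + Qc = 0 for P using Kronecker vectorization."""
    n = Ac.shape[0]
    I = np.eye(n)
    # Column-major vec identities: vec(Ac^T P) = (I kron Ac^T) vec(P),
    # vec(P Ac) = (Ac^T kron I) vec(P).
    M = np.kron(I, Ac.T) + np.kron(Ac.T, I)
    p = np.linalg.solve(M, -Qc.reshape(-1, order="F"))
    P = p.reshape((n, n), order="F")
    return 0.5 * (P + P.T)  # symmetrize against round-off


def kleinman_policy_iteration(A, B, Q, R, K0, iters=20):
    """Alternate policy evaluation (Lyapunov solve) and policy improvement.

    K0 must be stabilizing, i.e. A - B K0 must be Hurwitz.
    """
    K = K0
    for _ in range(iters):
        Ac = A - B @ K                                # closed loop under current gain
        P = solve_lyapunov(Ac, Q + K.T @ R @ K)       # cost matrix of current gain
        K = np.linalg.solve(R, B.T @ P)               # improved gain
    return P, K


# Hypothetical example system and weights (assumptions, not taken from the paper).
A = np.array([[0.0, 1.0], [-1.0, -2.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
K0 = np.zeros((1, 2))  # stabilizing here because this A is already Hurwitz

P, K = kleinman_policy_iteration(A, B, Q, R, K0)
are_residual = A.T @ P + P @ A - P @ B @ np.linalg.solve(R, B.T @ P) + Q
print("learned gain K =", K)
print("ARE residual norm =", np.linalg.norm(are_residual))
```

The final check evaluates the algebraic Riccati equation at the returned `P`; a residual near zero indicates the iteration has converged to the optimal gain for this time-invariant problem.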

Taxonomy

Topics: Adaptive Dynamic Programming Control · Iterative Learning Control Systems · Advanced Control Systems Optimization