Online Policies for Real-Time Control Using MRAC-RL

Anubhav Guha; Anuradha Annaswamy

arXiv:2103.16551·eess.SY·October 20, 2021

TL;DR

This paper introduces MRAC-RL, a framework that combines adaptive control with reinforcement learning to produce online control policies that adapt in real time to modeling errors in dynamical systems. The approach is demonstrated on a quadrotor landing task.

Contribution

The paper presents a new MRAC-RL framework with stability guarantees, proposes novel adaptive algorithms, and demonstrates improved real-time control of nonlinear systems compared with existing RL methods.

Findings

01  MRAC-RL outperforms state-of-the-art RL algorithms in simulations.

02  The framework provides stability guarantees for nonlinear control.

03  Adaptive tracking is successfully achieved in the quadrotor landing task.

Abstract

In this paper, we propose the Model Reference Adaptive Control & Reinforcement Learning (MRAC-RL) approach to developing online policies for systems in which modeling errors occur in real time. Although reinforcement learning (RL) algorithms have been successfully used to develop control policies for dynamical systems, discrepancies between simulated dynamics and the true target dynamics can cause trained policies to fail to generalize and adapt appropriately when deployed in the real world. The MRAC-RL framework generates online policies by utilizing an inner-loop adaptive controller together with a simulation-trained outer-loop RL policy. This structure allows MRAC-RL to adapt and operate effectively in a target environment, even when parametric uncertainties exist. We propose a set of novel MRAC algorithms, apply them to a class of nonlinear systems, derive the associated control…
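
As a rough illustration of the inner-loop/outer-loop structure described in the abstract, the Python sketch below runs a textbook scalar model reference adaptive controller under an outer-loop policy. It is not the authors' algorithm: the abstract does not give the MRAC update laws, so a classical gradient adaptive law for a first-order linear plant is used instead of the paper's nonlinear setting. The function names (`outer_policy`, `simulate`), the fixed proportional law standing in for a trained RL policy, and all numerical values are assumptions made for illustration only.

```python
def outer_policy(x_ref, target=1.0):
    """Hypothetical stand-in for a simulation-trained RL policy.

    A fixed proportional law on the reference (simulated) state; a real
    outer loop would be a learned policy.
    """
    return 2.0 * (target - x_ref)


def simulate(a_true=1.0, b_true=2.0, a_nom=-1.0, b_nom=1.0,
             gamma=5.0, dt=1e-3, steps=30000, adapt=True):
    """Return the final tracking error |x - x_ref| after forward-Euler simulation.

    Reference model (the "simulator"):  x_ref' = a_nom  * x_ref + b_nom  * r
    True plant (parameters unknown):    x'     = a_true * x     + b_true * u
    Inner-loop control:                 u      = k_x * x + k_r * r
    Classical adaptive law, assuming b_true > 0:
        k_x' = -gamma * e * x,   k_r' = -gamma * e * r,   e = x - x_ref
    """
    x, x_ref = 0.0, 0.0
    k_x, k_r = 0.0, 1.0  # gains that would be exact if plant == reference model
    for _ in range(steps):
        r = outer_policy(x_ref)      # outer loop: command from the policy
        u = k_x * x + k_r * r        # inner loop: adaptive control input
        e = x - x_ref                # tracking error against the reference
        if adapt:
            k_x -= dt * gamma * e * x
            k_r -= dt * gamma * e * r
        x += dt * (a_true * x + b_true * u)
        x_ref += dt * (a_nom * x_ref + b_nom * r)
    return abs(x - x_ref)


if __name__ == "__main__":
    print("final error with adaptation:   ", simulate(adapt=True))
    print("final error without adaptation:", simulate(adapt=False))
```

In this toy setup the true plant is open-loop unstable while the reference model is stable, so the policy's command applied directly to the plant diverges, whereas the adaptive inner loop drives the plant state toward the reference trajectory. The comparison is meant only to convey why an adaptive inner loop can help a simulation-trained policy tolerate parametric mismatch.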
