Balancing Exploration for Online Receding Horizon Learning Control with Provable Regret Guarantees

Deepan Muthirayan; Jianjun Yuan; Pramod P. Khargonekar

arXiv:2010.07269 · math.OC · November 2, 2022

TL;DR

This paper introduces a novel online receding horizon control algorithm that balances exploration and exploitation in unknown linear systems, achieving provable sub-linear regret and constraint satisfaction guarantees.

Contribution

It proposes a new exploration method for receding horizon control that ensures persistent excitation and sub-linear regret bounds in unknown linear systems.

Findings

01 Regret bounded by O(T^{3/4})

02 Ensures persistent excitation for exploration

03 Balances exploration and exploitation effectively
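
To make the first finding concrete: a regret bound that grows more slowly than T means the average per-step regret vanishes as the horizon grows. The symbols below (R_T for the cumulative regret over T steps, c for a constant) are notation assumed here for illustration and are not taken from the paper.

```latex
% Assumed notation: R_T = cumulative regret over T steps, c > 0 a constant.
\[
  R_T \le c\,T^{3/4}
  \quad\Longrightarrow\quad
  \frac{R_T}{T} \le c\,T^{-1/4} \longrightarrow 0
  \quad \text{as } T \to \infty .
\]
% Example: increasing T by a factor of 16 halves the bound on the
% average regret, since 16^{-1/4} = 1/2.
```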

Abstract

We address the problem of simultaneous learning and control in an online receding horizon control setting. We consider the control of an unknown linear dynamical system with general cost functions and affine constraints on the control input. Our goal is to develop an online learning algorithm that minimizes the dynamic regret, defined as the difference between the cumulative cost incurred by the algorithm and that of the best policy that has full knowledge of the system, cost functions, and state and that satisfies the control input constraints. We propose a novel approach to exploration in an online receding horizon setting. The key challenge is to ensure that the control generated by the receding horizon controller is persistently exciting. Our approach is to apply a perturbation to the control input generated by the receding horizon controller that balances the…
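
The role of persistent excitation can be illustrated with a small numerical sketch. This is not the authors' algorithm: the system matrices, the fixed feedback gain `K` standing in for a receding horizon controller, the box constraint on the input, and the Gaussian perturbation are all assumptions chosen for illustration. The sketch compares least-squares identification of (A, B) with and without a perturbation added to the control input.

```python
import numpy as np

def simulate_and_estimate(perturb_scale, T=200, seed=0):
    """Run a noise-free linear system under feedback plus an optional
    input perturbation, then estimate (A, B) by least squares.

    Everything here is a hypothetical stand-in, not the paper's method.
    Returns (smallest eigenvalue of the regressor Gram matrix,
             Frobenius-norm error of the [A, B] estimate).
    """
    rng = np.random.default_rng(seed)
    A = np.array([[0.9, 0.2], [0.0, 0.8]])   # assumed true dynamics
    B = np.array([[0.0], [1.0]])
    K = np.array([[0.1, 0.3]])               # stand-in for the controller
    x = np.array([1.0, 0.0])

    Z, Y = [], []
    for _ in range(T):
        # Nominal control plus exploration perturbation.
        u = -K @ x + perturb_scale * rng.standard_normal(1)
        # Simple box constraint standing in for affine input constraints.
        u = np.clip(u, -1.0, 1.0)
        x_next = A @ x + B @ u
        Z.append(np.concatenate([x, u]))     # regressor [x_t, u_t]
        Y.append(x_next)                     # target x_{t+1}
        x = x_next

    Z, Y = np.array(Z), np.array(Y)
    # Solve Y ~ Z @ theta, where theta.T = [A, B].
    theta, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    A_hat, B_hat = theta.T[:, :2], theta.T[:, 2:]

    # Persistent excitation shows up as a Gram matrix Z^T Z that is
    # bounded away from singular.
    min_eig = np.linalg.eigvalsh(Z.T @ Z).min()
    err = np.linalg.norm(np.hstack([A_hat - A, B_hat - B]))
    return min_eig, err

if __name__ == "__main__":
    for scale in (0.0, 0.1):
        min_eig, err = simulate_and_estimate(scale)
        print(f"perturbation={scale}: min eig={min_eig:.3e}, error={err:.3e}")
```

With `perturb_scale=0.0` the input is an exact linear function of the state, so the regressors span only a two-dimensional subspace and (A, B) cannot be identified uniquely. A nonzero perturbation restores full rank at the price of a less optimal control, which is the exploration versus exploitation trade-off the abstract refers to.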

Taxonomy

Topics: Advanced Bandit Algorithms Research · Advanced Control Systems Optimization