Dynamic Regret Minimization for Control of Non-stationary Linear Dynamical Systems

Yuwei Luo; Varun Gupta; Mladen Kolar

arXiv:2111.03772·cs.LG·March 21, 2022

TL;DR

This paper introduces an adaptive control algorithm for non-stationary linear dynamical systems that achieves near-optimal regret bounds by detecting and adapting to changes in system dynamics.

Contribution

It presents a novel non-stationarity detection strategy for LQR control, achieving optimal dynamic regret bounds under unknown and changing system dynamics.

Findings

01

Achieves the optimal dynamic regret of $\tilde{O}(V_T^{2/5} T^{3/5})$ for general non-stationary dynamics, where $V_T$ is the total variation of the dynamics.

02

Attains optimal regret of $\tilde{O}(\sqrt{ST})$ for piece-wise constant dynamics.

03

Demonstrates that non-adaptive forgetting methods may be suboptimal for non-stationary LQR control.

Abstract

We consider the problem of controlling a Linear Quadratic Regulator (LQR) system over a finite horizon $T$ with fixed and known cost matrices $Q, R$, but unknown and non-stationary dynamics $\{A_t, B_t\}$. The sequence of dynamics matrices can be arbitrary, but with a total variation, $V_T$, assumed to be $o(T)$ and unknown to the controller. Under the assumption that a sequence of stabilizing, but potentially sub-optimal controllers is available for all $t$, we present an algorithm that achieves the optimal dynamic regret of $\tilde{O}(V_T^{2/5} T^{3/5})$. With piece-wise constant dynamics, our algorithm achieves the optimal regret of $\tilde{O}(\sqrt{ST})$ where $S$ is the number of switches. The crux of our algorithm is an adaptive non-stationarity detection strategy, which builds on an approach recently developed for contextual Multi-armed Bandit…
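To make the quantities in the abstract concrete, here is a minimal sketch (not the paper's algorithm) that measures the total variation $V_T$ of a sequence of dynamics matrices and the quadratic cost of a given controller sequence. Several details are assumptions made only for illustration: the Frobenius norm used to measure change, the additive noise $w_t$, the sign convention $u_t = K_t x_t$, and the helper names `total_variation` and `rollout_cost`.

```python
import numpy as np


def total_variation(As, Bs):
    """Sum of step-to-step changes in (A_t, B_t).

    The Frobenius norm is an assumed choice; the paper may use another norm.
    """
    return sum(
        np.linalg.norm(As[t] - As[t - 1]) + np.linalg.norm(Bs[t] - Bs[t - 1])
        for t in range(1, len(As))
    )


def rollout_cost(As, Bs, Ks, Q, R, x0, noise):
    """Total cost sum_t (x_t' Q x_t + u_t' R u_t) under u_t = K_t x_t,
    with assumed dynamics x_{t+1} = A_t x_t + B_t u_t + w_t."""
    x, cost = x0, 0.0
    for A, B, K, w in zip(As, Bs, Ks, noise):
        u = K @ x
        cost += float(x @ Q @ x + u @ R @ u)
        x = A @ x + B @ u + w
    return cost


# Hypothetical scalar system with one switch (S = 1) halfway through the horizon.
T = 100
As = [np.array([[0.5]])] * 50 + [np.array([[0.8]])] * 50
Bs = [np.array([[1.0]])] * T
Ks = [np.array([[-0.5]])] * T  # a stabilizing but not necessarily optimal gain

rng = np.random.default_rng(0)
noise = rng.normal(size=(T, 1))

V_T = total_variation(As, Bs)
cost = rollout_cost(As, Bs, Ks, np.eye(1), np.eye(1), np.array([0.0]), noise)
print(f"V_T = {V_T:.3f}, cost of fixed gain = {cost:.2f}")
```

Dynamic regret compares a cost like the one computed here against the cost of the best controller for each $(A_t, B_t)$ in turn; the sketch only evaluates one fixed gain and does not compute that benchmark.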
