Logarithmic Regret in Adaptive Control of Noisy Linear Quadratic Regulator Systems Using Hints
Mohammad Akbari, Bahman Gharesifard, Tamas Linder

TL;DR
This paper shows that poly-logarithmic regret is achievable in adaptive control of noisy linear-quadratic systems even when both system matrices are unknown, provided the controller receives periodic hints.
Contribution
The paper presents a result showing that poly-logarithmic regret is achievable when both system matrices are unknown, provided hints are available. This encompasses the earlier results in which only one of the two matrices is unknown.
Findings
Poly-logarithmic regret is achievable with hints.
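The result encompasses both previously studied scenarios in which only one of the two system matrices is unknown.
This contrasts with the square-root-of-time-horizon limit that holds for algorithms using only data from the past system trajectory.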
Abstract
The problem of regret minimization for online adaptive control of linear-quadratic systems is studied. In this problem, the true system transition parameters (matrices A and B) are unknown, and the objective is to design and analyze algorithms that generate control policies with sublinear regret. Recent studies show that when the system parameters are fully unknown, there exists a choice of these parameters such that any algorithm that only uses data from the past system trajectory at best achieves a square root of time horizon regret bound, providing a hard fundamental limit on the achievable regret in general. However, it is also known that (poly)-logarithmic regret is achievable when only matrix A or only matrix B is unknown. We present a result, encompassing both scenarios, showing that (poly)-logarithmic regret is achievable when both of these matrices are unknown, but a…
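To make the notion of regret in the abstract concrete, here is a minimal Python sketch for a scalar system x_{t+1} = a x_t + b u_t + w_t with stage cost q x^2 + r u^2. It does not implement the paper's hint-based algorithm. It only compares the cumulative cost of a fixed controller against the optimal long-run average cost, which is one common way to define regret in this setting. All names and numeric values (`a`, `b`, `q`, `r`, `a_wrong`, the horizon, the noise level) are hypothetical choices made for illustration.

```python
import random


def riccati_scalar(a, b, q, r, iters=1000):
    """Iterate the scalar discrete-time Riccati recursion to its fixed point."""
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return p


def lqr_gain(a, b, p, r):
    """Gain k of the linear feedback law u = -k * x for Riccati solution p."""
    return a * b * p / (r + b * b * p)


def simulate_cost(a, b, q, r, k, horizon, noise_std, seed=0):
    """Cumulative quadratic cost of u = -k * x on the true system (a, b)."""
    rng = random.Random(seed)
    x, total = 0.0, 0.0
    for _ in range(horizon):
        u = -k * x
        total += q * x * x + r * u * u
        x = a * x + b * u + rng.gauss(0.0, noise_std)
    return total


# Hypothetical true system and cost weights (illustration only).
a, b, q, r = 0.9, 0.5, 1.0, 1.0
noise_std, horizon = 1.0, 20000

p_star = riccati_scalar(a, b, q, r)
k_star = lqr_gain(a, b, p_star, r)
# Optimal long-run average cost for the scalar problem: p * noise variance.
optimal_avg_cost = p_star * noise_std ** 2

# A controller designed from a wrong estimate of a (assumed value).
a_wrong = 0.5
k_wrong = lqr_gain(a_wrong, b, riccati_scalar(a_wrong, b, q, r), r)

# Regret = cumulative cost minus horizon times the optimal average cost.
regret_opt = simulate_cost(a, b, q, r, k_star, horizon, noise_std) - horizon * optimal_avg_cost
regret_wrong = simulate_cost(a, b, q, r, k_wrong, horizon, noise_std) - horizon * optimal_avg_cost

print(f"regret with optimal gain:    {regret_opt:.1f}")
print(f"regret with mismatched gain: {regret_wrong:.1f}")
```

A fixed mismatched gain pays a constant extra cost per step, so its regret grows linearly with the horizon. An adaptive controller aims to shrink that gap over time, and the abstract's square-root and (poly)-logarithmic rates describe how fast the accumulated gap can grow.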
Taxonomy
Topics: Advanced Bandit Algorithms Research · Smart Grid Energy Management
