Optimal exploration strategies for finite horizon regret minimization in some adaptive control problems
Kévin Colin, Håkan Hjalmarsson, Xavier Bombois

TL;DR
This paper studies exploration strategies for finite-horizon regret minimization in adaptive minimum variance and linear quadratic control, showing that when the control horizon is pre-specified, lower regret can be obtained with either no external excitation or a new 'immediate' exploration scheme.
Contribution
It introduces a new exploration approach called 'immediate' exploration and analyzes its effectiveness in finite horizon regret minimization for adaptive control.
Findings
With a pre-specified control horizon, lower regret can be obtained with either no external excitation or immediate exploration.
Optimal regret rates depend on the control horizon and exploration strategy.
Theoretical and simulation results support the proposed exploration method.
Abstract
In this work, we consider the problem of regret minimization in adaptive minimum variance and linear quadratic control problems. Regret minimization has been extensively studied in the literature for both types of adaptive control problems. Most of these works give results on the optimal rate of the regret in the asymptotic regime. In the minimum variance case, the optimal asymptotic rate for the regret is $\log(T)$, which can be reached without any additional external excitation. On the contrary, for most adaptive linear quadratic problems, it is necessary to add an external excitation in order to get the optimal asymptotic rate of $\sqrt{T}$. In this paper, we show from a theoretical study, as well as in simulations, that when the control horizon is pre-specified a lower regret can be obtained with either no external excitation or a new exploration type termed…
Taxonomy
Topics: Advanced Bandit Algorithms Research
