Optimal exploration strategies for finite horizon regret minimization in some adaptive control problems

Kévin Colin; Håkan Hjalmarsson; Xavier Bombois

arXiv:2211.07949·math.OC·November 16, 2022

Open Access

TL;DR

This paper studies exploration strategies for regret minimization in adaptive minimum variance and linear quadratic control over a pre-specified finite horizon, showing that a lower regret can be obtained with either no external excitation or a new 'immediate' exploration scheme.

Contribution

It introduces a new exploration approach called 'immediate' exploration and analyzes its effectiveness in finite horizon regret minimization for adaptive control.

Findings

01

With a pre-specified control horizon, a lower regret can be obtained with either no external excitation or immediate exploration.

02

Optimal regret rates depend on the control horizon and exploration strategy.

03

Theoretical and simulation results support the proposed exploration method.

Abstract

In this work, we consider the problem of regret minimization in adaptive minimum variance and linear quadratic control problems. Regret minimization has been extensively studied in the literature for both types of adaptive control problems. Most of these works give results on the optimal rate of the regret in the asymptotic regime. In the minimum variance case, the optimal asymptotic rate for the regret is $\log(T)$, which can be reached without any additional external excitation. By contrast, for most adaptive linear quadratic problems, it is necessary to add an external excitation in order to get the optimal asymptotic rate of $\sqrt{T}$. In this paper, we show, through a theoretical study as well as simulations, that when the control horizon is pre-specified, a lower regret can be obtained with either no external excitation or a new exploration type termed…
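As a rough illustration of the minimum variance claim in the abstract, the sketch below simulates a certainty-equivalence controller on a toy scalar system and accumulates its regret. The model, the known unit input gain, the parameter values, and the function name `simulate_regret` are all assumptions made for illustration; none of them are taken from the paper. No external excitation is added: the process noise alone keeps the least-squares estimate informative.

```python
import math
import random


def simulate_regret(T=2000, a=0.8, sigma=1.0, seed=0):
    """Cumulative regret of a certainty-equivalence minimum variance controller.

    Toy model (an assumption, not the paper's setup):
        y[t+1] = a * y[t] + u[t] + e[t+1],  e ~ N(0, sigma^2)
    with the input gain known and equal to 1, and `a` unknown.
    """
    rng = random.Random(seed)
    y = 0.0
    a_hat = 0.0          # initial estimate of a
    num, den = 0.0, 0.0  # running least-squares sums
    regret = []
    total = 0.0
    for _ in range(T):
        u = -a_hat * y  # certainty-equivalence input, no added excitation
        # Expected excess one-step cost over the optimal input u = -a * y,
        # which would leave only the noise e[t+1] in the output.
        total += ((a - a_hat) * y) ** 2
        regret.append(total)
        y_next = a * y + u + rng.gauss(0.0, sigma)
        # Least-squares update from the relation (y_next - u) = a * y + noise.
        num += (y_next - u) * y
        den += y * y
        if den > 0.0:
            a_hat = num / den
        y = y_next
    return regret


if __name__ == "__main__":
    for horizon in (100, 1000, 10000):
        r = simulate_regret(T=horizon)
        # Ratio of cumulative regret to log(T), printed for inspection only.
        print(horizon, r[-1], r[-1] / math.log(horizon))
```

Running the script prints the cumulative regret and its ratio to $\log(T)$ for a few horizons. A single noise realization is used, so the numbers are only indicative; averaging over several seeds would give a steadier picture.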

Taxonomy

Topics: Advanced Bandit Algorithms Research