Autoregressive Bandits

Francesco Bacchiocchi; Gianmarco Genalti; Davide Maran; Marco Mussi; Marcello Restelli; Nicola Gatti; Alberto Maria Metelli

arXiv:2212.06251·cs.LG·February 21, 2024

TL;DR

This paper introduces Autoregressive Bandits, a new online learning framework for decision-making with autoregressive rewards, proposing an algorithm with sublinear regret and demonstrating its effectiveness through empirical validation.

Contribution

It formulates the Autoregressive Bandits setting, develops the AR-UCB algorithm with theoretical regret bounds, and empirically shows its advantages over existing bandit algorithms.

Findings

01

AR-UCB achieves sublinear regret of order $\frac{(k+1)^{3/2}\sqrt{nT}}{(1-\Gamma)^2}$.

02

The optimal policy can be efficiently computed under mild assumptions.

03

Empirical results demonstrate AR-UCB's robustness and superiority over baseline algorithms.

Abstract

Autoregressive processes naturally arise in a large variety of real-world scenarios, including stock markets, sales forecasting, weather prediction, advertising, and pricing. When facing a sequential decision-making problem in such a context, the temporal dependence between consecutive observations should be properly accounted for in order to guarantee convergence to the optimal policy. In this work, we propose a novel online learning setting, namely, Autoregressive Bandits (ARBs), in which the observed reward is governed by an autoregressive process of order $k$, whose parameters depend on the chosen action. We show that, under mild assumptions on the reward process, the optimal policy can be conveniently computed. Then, we devise a new optimistic regret minimization algorithm, namely, AutoRegressive Upper Confidence Bound (AR-UCB), that suffers sublinear regret of order…
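To make the setting concrete, the following is a minimal simulation sketch of a reward governed by an order-$k$ autoregressive process with action-dependent parameters. It is not taken from the paper or its repository. It assumes a linear form, x_t = gamma[a, 0] + sum_i gamma[a, i] * x_{t-i} + noise, with Gaussian noise and a zero initial history; the names `simulate_arb`, `steady_state_mean`, and `gamma`, and all numeric values, are hypothetical.

```python
import numpy as np


def simulate_arb(gamma, actions, noise_std=0.1, seed=0):
    """Simulate rewards of an assumed linear AR(k) process.

    gamma has shape (n_actions, k + 1): column 0 is the offset and
    columns 1..k weight the k most recent rewards (newest first).
    """
    rng = np.random.default_rng(seed)
    k = gamma.shape[1] - 1
    history = np.zeros(k)  # assumption: the process starts from zero rewards
    rewards = np.empty(len(actions))
    for t, a in enumerate(actions):
        # The parameters used at step t depend on the chosen action a.
        x = gamma[a, 0] + gamma[a, 1:] @ history + rng.normal(0.0, noise_std)
        rewards[t] = x
        if k > 0:
            # Shift the window: the newest reward goes first, the oldest drops.
            history = np.concatenate(([x], history[:-1]))
    return rewards


def steady_state_mean(gamma):
    """Long-run mean reward when one action is played forever.

    Standard AR(k) fact, valid when the lag coefficients of each action
    are non-negative and sum to less than 1 (assumed here).
    """
    return gamma[:, 0] / (1.0 - gamma[:, 1:].sum(axis=1))


# Hypothetical two-action instance with k = 2.
gamma = np.array([
    [1.0, 0.3, 0.2],  # action 0: larger offset, weaker memory
    [0.5, 0.5, 0.3],  # action 1: smaller offset, stronger memory
])

if __name__ == "__main__":
    rewards = simulate_arb(gamma, actions=[1] * 200, noise_std=0.1)
    print("steady-state means per action:", steady_state_mean(gamma))
    print("average of the last 50 simulated rewards:", rewards[-50:].mean())
```

In this hypothetical instance the steady-state means are approximately 2.0 for action 0 and 2.5 for action 1, although action 0 has the larger offset. This is the kind of temporal dependence that a learner ignoring past rewards could misjudge.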

Code & Models

Repositories

gianmarcogenalti/autoregressive-bandits (Official)

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Smart Grid Energy Management