Autoregressive Bandits
Francesco Bacchiocchi, Gianmarco Genalti, Davide Maran, Marco Mussi, Marcello Restelli, Nicola Gatti, Alberto Maria Metelli

TL;DR
This paper introduces Autoregressive Bandits, a new online learning framework for decision-making with autoregressive rewards, proposing an algorithm with sublinear regret and demonstrating its effectiveness through empirical validation.
Contribution
It formulates the Autoregressive Bandits setting, develops the AR-UCB algorithm with theoretical regret bounds, and empirically shows its advantages over existing bandit algorithms.
Findings
AR-UCB achieves sublinear regret of order $\frac{(k+1)^{3/2}\sqrt{nT}}{(1-\Gamma)^2}$.
The optimal policy can be efficiently computed under mild assumptions.
Empirical results demonstrate AR-UCB's robustness and superiority over baseline algorithms.
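As a rough illustration of why the stated rate is sublinear, the sketch below evaluates the expression $(k+1)^{3/2}\sqrt{nT}/(1-\Gamma)^2$ and divides it by $T$. The function name `regret_order` is hypothetical, constants and logarithmic factors are ignored, and $\Gamma$ is simply treated as a number in $[0, 1)$, as the $(1-\Gamma)^2$ denominator suggests; the values passed in are made up for the example.

```python
import math

def regret_order(k, n, T, Gamma):
    """Evaluate (k+1)^(3/2) * sqrt(n*T) / (1-Gamma)^2 (constants and logs ignored)."""
    if not 0 <= Gamma < 1:
        # Assumption: Gamma lies in [0, 1) so the denominator stays positive.
        raise ValueError("Gamma is assumed to lie in [0, 1)")
    return (k + 1) ** 1.5 * math.sqrt(n * T) / (1 - Gamma) ** 2

# Illustrative values only: the per-round quantity shrinks as T grows,
# because the expression scales with sqrt(T) rather than T.
for T in (100, 10_000, 1_000_000):
    print(T, regret_order(k=2, n=5, T=T, Gamma=0.5) / T)
```

Because the expression grows as $\sqrt{T}$, multiplying the horizon by 100 multiplies it by only 10, so the per-round average decreases.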
Abstract
Autoregressive processes naturally arise in a large variety of real-world scenarios, including stock markets, sales forecasting, weather prediction, advertising, and pricing. When facing a sequential decision-making problem in such a context, the temporal dependence between consecutive observations must be properly accounted for to guarantee convergence to the optimal policy. In this work, we propose a novel online learning setting, namely, Autoregressive Bandits (ARBs), in which the observed reward is governed by an autoregressive process of order $k$, whose parameters depend on the chosen action. We show that, under mild assumptions on the reward process, the optimal policy can be conveniently computed. Then, we devise a new optimistic regret minimization algorithm, namely, AutoRegressive Upper Confidence Bound (AR-UCB), that suffers sublinear regret of order…
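The reward model described in the abstract can be sketched as a small simulation. This is a minimal illustration, not the authors' code: it assumes the reward takes the form x_t = γ_0(a_t) + Σ_{i=1..k} γ_i(a_t) · x_{t-i} + noise, with Gaussian noise and a zero initial history. The function name `simulate_ar_rewards`, the coefficient values, and the noise level are all hypothetical.

```python
import random

def simulate_ar_rewards(gamma, actions, noise_std=0.1, seed=0):
    """Simulate rewards from an action-dependent AR(k) process.

    gamma maps each action to [gamma_0, gamma_1, ..., gamma_k] (assumed form).
    actions is the sequence of actions played, one per round.
    """
    rng = random.Random(seed)
    k = len(next(iter(gamma.values()))) - 1   # autoregressive order
    history = [0.0] * k                       # past rewards, most recent first
    rewards = []
    for a in actions:
        coeffs = gamma[a]
        # Expected reward: intercept plus weighted sum of the last k rewards.
        mean = coeffs[0] + sum(c * x for c, x in zip(coeffs[1:], history))
        x = mean + rng.gauss(0.0, noise_std)  # Gaussian noise is an assumption
        rewards.append(x)
        if k > 0:
            history = [x] + history[:-1]
    return rewards

# Two hypothetical actions with order k = 2; coefficients are illustrative.
gamma = {
    "a": [1.0, 0.3, 0.1],   # larger intercept, weaker dependence on the past
    "b": [0.5, 0.5, 0.2],   # smaller intercept, stronger dependence on the past
}
print(simulate_ar_rewards(gamma, ["a", "b", "a", "a", "b"]))
```

The point of the sketch is that each action affects not only the current reward but also, through the history, the rewards of later rounds, which is the temporal dependence the abstract refers to.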
Taxonomy
Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Smart Grid Energy Management
