Stochastic Rising Bandits
Alberto Maria Metelli, Francesco Trovò, Matteo Pirola, Marcello Restelli

TL;DR
This paper introduces specialized algorithms for stochastic rising bandits, leveraging the monotonic payoff property to achieve tight regret bounds and demonstrating their effectiveness through empirical comparisons with existing methods.
Contribution
It presents novel algorithms for rested and restless stochastic rising bandits with regret bounds, and empirically validates their performance against state-of-the-art approaches.
Findings
Under certain conditions on the instance, the algorithms achieve regret bounds of $\tilde{O}(T^{2/3})$.
Proposed methods outperform existing algorithms on synthetic and real-world data.
Effective in online model selection and non-stationary bandit scenarios.
Abstract
This paper is in the field of stochastic Multi-Armed Bandits (MABs), i.e., those sequential selection techniques able to learn online using only the feedback given by the chosen option (a.k.a. arm). We study a particular case of the rested and restless bandits in which the arms' expected payoff is monotonically non-decreasing. This characteristic allows designing specifically crafted algorithms that exploit the regularity of the payoffs to provide tight regret bounds. We design an algorithm for the rested case (R-ed-UCB) and one for the restless case (R-less-UCB), providing a regret bound depending on the properties of the instance and, under certain circumstances, of $\tilde{O}(T^{2/3})$. We empirically compare our algorithms with state-of-the-art methods for non-stationary MABs over several synthetically generated tasks and an online model selection problem for a…
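The rested rising setting described in the abstract can be illustrated with a small simulation. The sketch below is not R-ed-UCB or R-less-UCB. It assumes a hypothetical exponential-saturation payoff curve (`rising_mean`), Gaussian noise, and a generic sliding-window UCB index (`run_window_ucb`), all chosen here only to show what "expected payoff non-decreasing in the number of pulls" looks like in code.

```python
import math
import random


def rising_mean(n_pulls, cap, rate):
    """Hypothetical expected payoff of an arm already pulled n_pulls times.

    The curve cap * (1 - exp(-rate * n)) is non-decreasing in n, which is
    the defining property of a rising bandit arm (illustrative choice).
    """
    return cap * (1.0 - math.exp(-rate * n_pulls))


def run_window_ucb(arms, horizon, window=20, noise=0.05, seed=0):
    """Generic sliding-window UCB on rested rising arms (not the paper's method).

    arms is a list of (cap, rate) pairs. In the rested case an arm's
    expected payoff advances only when that arm is pulled.
    """
    rng = random.Random(seed)
    pulls = [0] * len(arms)
    history = [[] for _ in arms]
    total = 0.0
    for t in range(1, horizon + 1):
        indices = []
        for i in range(len(arms)):
            recent = history[i][-window:]  # only the most recent rewards
            if not recent:
                indices.append(float("inf"))  # force one initial pull per arm
                continue
            mean = sum(recent) / len(recent)
            bonus = math.sqrt(2.0 * math.log(t) / len(recent))
            indices.append(mean + bonus)
        arm = max(range(len(arms)), key=lambda i: indices[i])
        cap, rate = arms[arm]
        reward = rising_mean(pulls[arm], cap, rate) + rng.gauss(0.0, noise)
        pulls[arm] += 1
        history[arm].append(reward)
        total += reward
    return total, pulls


if __name__ == "__main__":
    # A slow riser with a high ceiling versus a fast riser with a low ceiling.
    total, pulls = run_window_ucb([(0.9, 0.01), (0.5, 0.2)], horizon=300)
    print("total reward:", round(total, 2), "pulls per arm:", pulls)
```

The two example arms highlight why the setting is hard: an arm that looks worse early on can overtake the other only if it is pulled often enough, which a window over recent rewards does not anticipate.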
Taxonomy
Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Risk and Portfolio Optimization
