Adapting Bandit Algorithms for Settings with Sequentially Available Arms
Marco Gabrielli, Francesco Trovò, Manuela Antonelli

TL;DR
This paper introduces a meta-algorithm called Seq that adapts classical Multi-Armed Bandit policies for scenarios where options are presented sequentially, improving information gathering and maintaining theoretical guarantees.
Contribution
The paper proposes Seq, a meta-algorithm that adapts any classical MAB policy to settings where arms become available sequentially, covering both regret minimization and best-arm identification.
Findings
Seq improves empirical performance on synthetic and real datasets.
Seq maintains theoretical guarantees of classical MAB policies.
Seq is effective in advertising and environmental monitoring applications.
Abstract
Although the classical version of the Multi-Armed Bandits (MAB) framework has been applied successfully to several practical problems, in many real-world applications, the possible actions are not presented to the learner simultaneously, such as in the Internet campaign management and environmental monitoring settings. Instead, in such applications, a set of options is presented sequentially to the learner within a time span, and this process is repeated throughout a time horizon. At each time, the learner is asked whether to select the proposed option or not. We define this scenario as the Sequential Pull/No-pull Bandit setting, and we propose a meta-algorithm, namely Sequential Pull/No-pull for MAB (Seq), to adapt any classical MAB policy to better suit this setting for both the regret minimization and best-arm identification problems. By allowing the selection of multiple arms within…
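The interaction protocol described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's Seq algorithm, whose details are not given here. It is a hypothetical pull/no-pull rule built on the standard UCB1 index, assuming Bernoulli rewards: at the start of each round the indices are frozen, the arms are then shown one at a time, and the learner pulls an arm only if its index ties the best index of that round. The names `ucb_index` and `run_sequential_pull_no_pull` are made up for illustration.

```python
import math
import random


def ucb_index(mean, count, t):
    """Standard UCB1 index; arms never pulled get +inf so they are tried first."""
    if count == 0:
        return math.inf
    return mean + math.sqrt(2.0 * math.log(t) / count)


def run_sequential_pull_no_pull(true_means, horizon, seed=0):
    """Simulate `horizon` rounds in which arms are offered one at a time.

    Illustrative assumption (not taken from the paper): an arm is pulled
    when its UCB1 index, frozen at the start of the round, equals the
    largest index of that round. Rewards are Bernoulli(true_means[arm]).
    """
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k      # number of pulls per arm
    means = [0.0] * k     # empirical mean reward per arm
    total_pulls = 0

    for _ in range(horizon):
        # Freeze the indices before the arms of this round are presented.
        t = total_pulls + 1
        indices = [ucb_index(means[a], counts[a], t) for a in range(k)]
        threshold = max(indices)

        # Arms arrive sequentially; the learner answers pull or no-pull.
        for arm in range(k):
            if indices[arm] >= threshold:
                reward = 1.0 if rng.random() < true_means[arm] else 0.0
                counts[arm] += 1
                means[arm] += (reward - means[arm]) / counts[arm]
                total_pulls += 1

    return counts, means


if __name__ == "__main__":
    counts, means = run_sequential_pull_no_pull([0.2, 0.5, 0.8], horizon=500)
    print("pulls per arm:", counts)
    print("empirical means:", [round(m, 3) for m in means])
```

Under this rule several arms can be pulled within the same round whenever their indices tie, which happens in the first round when every arm is still unexplored. A classical policy would instead pull exactly one arm per round.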
Taxonomy
Topics: Advanced Bandit Algorithms Research · Smart Grid Energy Management · Machine Learning and Algorithms
