Non-stationary Bandits and Meta-Learning with a Small Set of Optimal Arms
MohammadJavad Azizi, Thang Duong, Yasin Abbasi-Yadkori, András György, Claire Vernade, Mohammad Ghavamzadeh

TL;DR
This paper introduces algorithms for non-stationary and meta-learning bandit problems whose regret falls below that of standard non-stationary bandit methods when the number of tasks is large and the number of optimal arms is small relative to the total number of arms. Separate regret bounds are given for each setting.
Contribution
The paper proposes a reduction-based algorithm for non-stationary and meta-learning bandits that achieves improved regret bounds in regimes with few optimal arms.
Findings
In the regime of many tasks and few optimal arms, the regret is smaller than the \tilde{O}(\sqrt{KNT}) baseline.
For fixed task length \tau, regret is bounded by \tilde{O}(NM\sqrt{M\tau} + N^{2/3}M\tau), improved to \tilde{O}(N\sqrt{M\tau} + N^{1/2}\sqrt{MK\tau}) under additional assumptions.
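To see when the first bound beats the baseline, the two expressions can be evaluated directly. The sketch below is only an illustration under stated assumptions: it takes the baseline as \sqrt{KNT} and the meta-learning bound as NM\sqrt{M\tau} + N^{2/3}M\tau, sets T = N\tau (equal-length tasks), and drops all constants and logarithmic factors. The function names and the parameter values are hypothetical, chosen only to make the trend visible.

```python
import math

def baseline_bound(K, N, tau):
    """sqrt(K * N * T) with T = N * tau; constants and log factors dropped."""
    T = N * tau
    return math.sqrt(K * N * T)

def subset_bound(M, N, tau):
    """N*M*sqrt(M*tau) + N^(2/3)*M*tau; constants and log factors dropped."""
    return N * M * math.sqrt(M * tau) + N ** (2 / 3) * M * tau

# Hypothetical values: 100 arms, 3 optimal arms, tasks of length 1000.
K, M, tau = 100, 3, 1000
for N in (10, 1_000, 100_000):
    b = baseline_bound(K, N, tau)
    s = subset_bound(M, N, tau)
    print(f"N={N:>7}  baseline={b:.3e}  subset={s:.3e}  ratio={s / b:.2f}")
```

With these values the ratio starts above 1 for small N and drops below 1 as N grows. Dividing both expressions by N shows why: the comparison reduces to M\sqrt{M\tau} + N^{-1/3}M\tau against \sqrt{K\tau}, so for large N the bound improves on the baseline roughly when M^3 < K.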
Abstract
We study a sequential decision problem where the learner faces a sequence of K-armed bandit tasks. The task boundaries might be known (the bandit meta-learning setting), or unknown (the non-stationary bandit setting). For a given integer M \le K, the learner aims to compete with the best subset of arms of size M. We design an algorithm based on a reduction to bandit submodular maximization, and show that, for T rounds comprised of N tasks, in the regime of large number of tasks and small number of optimal arms M, its regret in both settings is smaller than the simple baseline of \tilde{O}(\sqrt{KNT}) that can be obtained by using standard algorithms designed for non-stationary bandit problems. For the bandit meta-learning problem with fixed task length \tau, we show that the regret of the algorithm is bounded as \tilde{O}(NM\sqrt{M\tau} + N^{2/3}M\tau). Under…
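The link between "best subset of arms of size M" and submodular maximization can be illustrated offline. The sketch below is not the paper's algorithm, which has to work from bandit feedback. It assumes the mean rewards of every arm in every task are known, and it assumes the value of a subset S is the sum over tasks of the best mean reward among the arms in S, which is a monotone submodular function of S for non-negative rewards. Greedy selection is the standard approach for such functions. All names and the synthetic data are hypothetical.

```python
import numpy as np

def subset_value(means, subset):
    """Sum over tasks of the best mean reward available within `subset`."""
    if not subset:
        return 0.0
    return float(means[:, sorted(subset)].max(axis=1).sum())

def greedy_subset(means, M):
    """Pick M arms one at a time, each maximizing the value of the grown set."""
    chosen = []
    for _ in range(M):
        remaining = [a for a in range(means.shape[1]) if a not in chosen]
        best = max(remaining, key=lambda a: subset_value(means, chosen + [a]))
        chosen.append(best)
    return chosen

# Synthetic instance: N tasks, K arms, and the optimal arm of every task
# is drawn from a small hypothetical pool of M = 3 arms.
rng = np.random.default_rng(0)
N, K, M = 50, 20, 3
optimal_pool = [2, 7, 11]
means = rng.uniform(0.0, 0.3, size=(N, K))   # suboptimal arms: mean below 0.3
best_arm = rng.choice(optimal_pool, size=N)  # which pool arm is optimal per task
means[np.arange(N), best_arm] = 0.9          # optimal arm: mean 0.9

subset = greedy_subset(means, M)
print(sorted(subset), subset_value(means, subset))
```

On this instance a learner restricted to the selected subset loses nothing in any task as long as the greedy choice recovers the pool, which is the sense in which competing with the best size-M subset is a meaningful target when few arms are ever optimal.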
Taxonomy
Topics: Advanced Bandit Algorithms Research · Distributed Sensor Networks and Detection Algorithms · Machine Learning and Algorithms
