Generalized non-stationary bandits
Anne Gael Manegueu, Alexandra Carpentier, Yi Yu

TL;DR
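This paper introduces a single algorithm for a broad class of non-stationary stochastic bandit problems. It covers arm means that switch abruptly, are locally polynomial, or are locally smooth, as well as gaps with a bounded number of inflection points. These settings share two properties: the number of similarly-sized level sets of the log-gaps is controlled, and the highest mean has limited variation apart from a limited number of abrupt changes.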
Contribution
The paper proposes a single, efficient algorithm that addresses multiple non-stationary bandit settings with different mean variation structures.
Findings
The algorithm covers both abruptly switching means and locally smooth means.
It also covers locally polynomial means and gaps with a bounded number of inflection points.
A single algorithm solves all four settings in a unified way.
Abstract
In this paper, we study a non-stationary stochastic bandit problem, which generalizes the switching bandit problem. On top of the switching bandit problem (Case a), we are interested in three concrete examples: (b) the means of the arms are local polynomials, (c) the means of the arms are locally smooth, and (d) the gaps of the arms have a bounded number of inflexion points and the highest arm mean cannot vary too much in a short range. These three settings are very different, but have in common the following: (i) the number of similarly-sized level sets of the logarithm of the gaps can be controlled, and (ii) the highest mean has a limited number of abrupt changes, and otherwise has limited variations. We propose a single algorithm in this general setting that, in particular, solves the four problems (a)-(d) in an efficient and unified way…
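To make property (i) concrete, the following is a minimal Python sketch and not the paper's algorithm or its formal definitions. It builds a hypothetical two-arm switching instance (case a), computes each arm's gap to the best mean, and counts how many maximal time segments an arm spends at one dyadic level of the log-gap. The function names, the mean values, and the dyadic discretization of the log-gap are all assumptions made for illustration.

```python
import math

def switching_means(t, horizon):
    # Hypothetical two-arm instance of case (a): the means change once, at the midpoint.
    return [0.8, 0.5] if t < horizon // 2 else [0.3, 0.6]

def gaps(means):
    # Gap of each arm: the best mean at this round minus the arm's own mean.
    best = max(means)
    return [best - m for m in means]

def log_gap_level(gap):
    # Assumed discretization: level k means 2**-(k+1) < gap <= 2**-k.
    # The best arm has gap 0 and is given the level None.
    if gap <= 0:
        return None
    return math.floor(-math.log2(gap))

def count_level_segments(mean_fn, horizon, arm):
    # Count the maximal runs of consecutive rounds during which the
    # arm's log-gap level stays the same.
    segments = 0
    previous = object()  # sentinel that compares unequal to every level
    for t in range(horizon):
        level = log_gap_level(gaps(mean_fn(t, horizon))[arm])
        if level != previous:
            segments += 1
            previous = level
    return segments
```

In this toy instance, `count_level_segments(switching_means, 100, 0)` counts two segments for arm 0: it is the best arm before the switch and has a gap of 0.3 afterwards. Under this reading, a switching problem with few change points yields few segments, which is one way to picture the kind of control the abstract describes.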
Taxonomy
Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Sparse and Compressive Sensing Techniques
