Generalized non-stationary bandits

Anne Gael Manegueu; Alexandra Carpentier; Yi Yu

arXiv:2102.00725 · stat.ML · February 3, 2021 · 1 citation

TL;DR

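This paper introduces a single algorithm for a broad class of non-stationary stochastic bandit problems, covering switching means, locally polynomial means, locally smooth means, and gaps with a bounded number of inflexion points. These settings share two properties: the number of similarly-sized level sets of the logarithm of the gaps can be controlled, and the highest mean has few abrupt changes and otherwise limited variation.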

Contribution

The paper proposes a single, efficient algorithm that addresses multiple non-stationary bandit settings with different mean variation structures.

Findings

01 The algorithm handles both abruptly switching means and locally smooth means.

02 It also covers arms whose means are local polynomials and arms whose gaps have a bounded number of inflexion points.

03 A single algorithm addresses all four settings (a)-(d), rather than requiring a separate method for each.

Abstract

In this paper, we study a non-stationary stochastic bandit problem, which generalizes the switching bandit problem. On top of the switching bandit problem (case a), we are interested in three concrete examples: (b) the means of the arms are local polynomials, (c) the means of the arms are locally smooth, and (d) the gaps of the arms have a bounded number of inflexion points and the highest arm mean cannot vary too much in a short range. These three settings are very different, but have the following in common: (i) the number of similarly-sized level sets of the logarithm of the gaps can be controlled, and (ii) the highest mean has a limited number of abrupt changes, and otherwise has limited variations. We propose a single algorithm in this general setting that, in particular, solves the four problems (a)-(d) in an efficient and unified way…
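
As a rough illustration of the switching setting (case a) and of how performance is commonly measured in non-stationary bandits, the sketch below builds a hypothetical two-arm instance whose means swap once and computes the dynamic regret of a policy, that is, the cumulative gap to the best arm at each round. The instance, the function names (`switching_means`, `dynamic_regret`) and the two toy policies are assumptions made for illustration; this is not the algorithm proposed in the paper.

```python
def switching_means(t, horizon):
    """Hypothetical two-arm switching instance: the means change once, at mid-horizon."""
    if t < horizon // 2:
        return [0.7, 0.3]  # arm 0 is best in the first half
    return [0.2, 0.6]      # arm 1 is best in the second half


def dynamic_regret(policy, horizon):
    """Sum over rounds of (best mean at round t) - (mean of the arm pulled at round t)."""
    regret = 0.0
    for t in range(horizon):
        means = switching_means(t, horizon)
        arm = policy(t)  # toy policies depend only on t; a learner would also use past rewards
        regret += max(means) - means[arm]
    return regret


def always_arm_0(t):
    """Stationary policy that ignores the change point."""
    return 0


def oracle(t, horizon=100):
    """Policy that knows the change point (assumed horizon of 100 rounds)."""
    return 0 if t < horizon // 2 else 1


if __name__ == "__main__":
    print(dynamic_regret(always_arm_0, 100))  # grows linearly after the switch
    print(dynamic_regret(oracle, 100))        # tracks the best arm throughout
```

A policy that never adapts pays the post-switch gap on every remaining round, while the oracle pays nothing; an algorithm for the switching setting aims to stay close to the oracle without knowing when the change occurs.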

Taxonomy

Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Sparse and Compressive Sensing Techniques