Adaptive Smooth Non-Stationary Bandits

Joe Suk

arXiv:2407.08654·stat.ML·February 27, 2025

TL;DR

This paper establishes the minimax dynamic regret rates for smooth non-stationary bandit models, develops adaptive algorithms without prior knowledge of smoothness parameters, and explores faster gap-dependent regret rates in environments with safe arms.

Contribution

It provides the first general minimax regret bounds for all smoothness parameters in non-stationary bandits and introduces adaptive algorithms that do not require prior parameter knowledge.

Findings

01 Established the minimax dynamic regret rate for every number of arms, Hölder exponent, and Hölder coefficient.

02 Designed adaptive algorithms that achieve this rate without prior knowledge of the smoothness parameters.

03 Identified conditions, in environments with safe arms, under which faster gap-dependent regret rates are possible.

Abstract

We study a $K$-armed non-stationary bandit model where rewards change smoothly, as captured by Hölder class assumptions on rewards as functions of time. Such smooth changes are parametrized by a Hölder exponent $\beta$ and coefficient $\lambda$. While various sub-cases of this general model have been studied in isolation, we first establish the minimax dynamic regret rate generally for all $K, \beta, \lambda$. Next, we show this optimal dynamic regret can be attained adaptively, without knowledge of $\beta, \lambda$. To contrast, even with parameter knowledge, upper bounds were only previously known for limited regimes $\beta \leq 1$ and $\beta = 2$ (Slivkins, 2014; Krishnamurthy and Gopalan, 2021; Manegueu et al., 2021; Jia et al., 2023). Thus, our work resolves open questions raised by these disparate threads of the literature. We also study the problem of attaining faster…
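To make the notion of dynamic regret concrete, here is a minimal Python sketch of a smoothly drifting bandit environment. Everything in it is an assumption for illustration: the sinusoidal mean rewards in `mean_reward`, the Bernoulli noise, and the sliding-window UCB policy are stand-ins chosen for simplicity, not the paper's model details or its adaptive algorithm. Dynamic regret is computed as the cumulative gap between the best arm's mean at each round and the mean of the arm actually played.

```python
import math
import random


def mean_reward(arm, t, horizon, num_arms):
    """Hypothetical smooth mean reward in [0.2, 0.8].

    A sinusoid in normalized time t / horizon, phase-shifted per arm,
    so the identity of the best arm drifts smoothly over the horizon.
    """
    phase = 2.0 * math.pi * arm / num_arms
    return 0.5 + 0.3 * math.sin(2.0 * math.pi * t / horizon + phase)


def sliding_window_ucb(horizon, num_arms, window, seed=0):
    """Run a sliding-window UCB baseline and return its dynamic regret.

    This is a generic baseline for illustration, not the paper's method.
    """
    rng = random.Random(seed)
    history = []  # list of (arm, observed reward) pairs
    regret = 0.0
    for t in range(horizon):
        # Only the most recent `window` observations inform the decision.
        recent = history[-window:]
        counts = [0] * num_arms
        sums = [0.0] * num_arms
        for a, r in recent:
            counts[a] += 1
            sums[a] += r

        untried = [a for a in range(num_arms) if counts[a] == 0]
        if untried:
            arm = untried[0]
        else:
            log_term = math.log(min(t + 1, window))
            arm = max(
                range(num_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * log_term / counts[a]),
            )

        means = [mean_reward(a, t, horizon, num_arms) for a in range(num_arms)]
        reward = 1.0 if rng.random() < means[arm] else 0.0  # Bernoulli draw
        history.append((arm, reward))

        # Dynamic regret compares against the best arm at this round.
        regret += max(means) - means[arm]
    return regret


if __name__ == "__main__":
    print(sliding_window_ucb(horizon=2000, num_arms=3, window=200, seed=1))
```

The window length here is fixed by hand, which is exactly the kind of tuning that depends on how fast rewards change; the abstract's claim is that the optimal rate can be reached without knowing the smoothness parameters.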

Code & Models

Repositories

joesuk/smoothbandits (Official)

Taxonomy

Topics: Advanced Bandit Algorithms Research · Smart Grid Energy Management · Data Stream Mining Techniques