Linear Bandits with Memory: from Rotting to Rising

Giulia Clerici; Pierre Laforgue; Nicolò Cesa-Bianchi

arXiv:2302.08345·cs.LG·May 26, 2023·1 cites

TL;DR

This paper introduces a nonstationary linear bandit model incorporating memory effects, analyzing regret bounds and proposing algorithms for both known and unknown parameters, with empirical validation.

Contribution

It develops a novel nonstationary linear bandit model with memory, providing regret analysis and algorithms for unknown parameters, extending the applicability of bandit methods to dynamic environments.

Findings

01

Regret bound of order $\tilde{O}(\text{poly}(d, m, \beta, T))$ for the proposed algorithm.

02

Algorithm performs well in experiments against natural baselines.

03

Model captures both rotting and rising phenomena in nonstationary bandit settings.

Abstract

Nonstationary phenomena, such as satiation effects in recommendations, have mostly been modeled using bandits with finitely many arms. However, the richer action space provided by linear bandits is often preferred in practice. In this work, we introduce a novel nonstationary linear bandit model, where current rewards are influenced by the learner's past actions in a fixed-size window. Our model, which recovers stationary linear bandits as a special case, leverages two parameters: the window size $m \geq 0$, and an exponent $γ$ that captures the rotting ($γ < 0$) or rising ($γ > 0$) nature of the phenomenon. When both $m$ and $γ$ are known, we propose and analyze a variant of OFUL which minimizes regret against cycling policies. By choosing the cycle length so as to trade-off approximation and estimation errors, we then prove a bound of order…
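
To make the roles of $m$ and $γ$ concrete, here is a minimal Python sketch of one way such a memory effect could be simulated. The abstract does not state the reward formula, so the form used here is an assumption: the expected reward is taken to be $a_t^\top A\,θ$ with memory matrix $A = (I + \sum_s a_s a_s^\top)^{γ}$, where the sum runs over the last $m$ actions. The function names `memory_matrix` and `expected_reward` are hypothetical and chosen only for illustration.

```python
import numpy as np

def memory_matrix(past_actions, gamma, dim):
    """Assumed memory matrix (I + sum_s a_s a_s^T)^gamma.

    The matrix inside the power is symmetric positive definite, so the
    power is computed through its eigendecomposition.
    """
    M = np.eye(dim)
    for a in past_actions:
        M += np.outer(a, a)
    eigvals, eigvecs = np.linalg.eigh(M)
    # V diag(lambda^gamma) V^T
    return (eigvecs * eigvals**gamma) @ eigvecs.T

def expected_reward(action, past_actions, theta, gamma):
    """Expected reward a^T A theta under the assumed memory model."""
    A = memory_matrix(past_actions, gamma, len(theta))
    return float(action @ A @ theta)

theta = np.array([1.0, 0.0])   # hypothetical unknown parameter
a = np.array([1.0, 0.0])       # action played repeatedly
window = [a, a]                # last m = 2 actions

stationary = expected_reward(a, [], theta, gamma=-1.0)   # empty window, m = 0
rotting = expected_reward(a, window, theta, gamma=-1.0)  # gamma < 0
rising = expected_reward(a, window, theta, gamma=1.0)    # gamma > 0
print(stationary, rotting, rising)
```

Under this assumed form, an empty window (or $γ = 0$) gives $A = I$, which is the stationary linear bandit; repeating the same action lowers its reward when $γ < 0$ and raises it when $γ > 0$.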

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Smart Grid Energy Management