From Theory to Practice with RAVEN-UCB: Addressing Non-Stationarity in Multi-Armed Bandits through Variance Adaptation

Junyi Fang; Yuxun Chen; Yuxin Chen; Chen Zhang

arXiv:2506.02933 · cs.LG · June 4, 2025

TL;DR

RAVEN-UCB is a new algorithm for non-stationary multi-armed bandits that adaptively uses variance information to improve exploration and achieve tighter regret bounds, demonstrating superior performance in dynamic environments.

Contribution

It introduces RAVEN-UCB, a variance-aware, adaptive algorithm with recursive updates, providing both theoretical guarantees and practical improvements over existing methods.

Findings

01. Achieves tighter regret bounds than UCB1 and UCB-V.

02. Performs better in non-stationary environments with distributional, periodic, and fluctuating changes.

03. Demonstrates robustness and efficiency in synthetic and logistics scenarios.

Abstract

The Multi-Armed Bandit (MAB) problem is challenging in non-stationary environments where reward distributions evolve dynamically. We introduce RAVEN-UCB, a novel algorithm that combines theoretical rigor with practical efficiency via variance-aware adaptation. It achieves tighter regret bounds than UCB1 and UCB-V, with gap-dependent regret of order $K \sigma_{\max}^{2} \log T / \Delta$ and gap-independent regret of order $\sqrt{K T \log T}$. RAVEN-UCB incorporates three innovations: (1) variance-driven exploration using $\hat{\sigma}_{k}^{2} / (N_{k} + 1)$ in confidence bounds, (2) adaptive control via $\alpha_{t} = \alpha_{0} / \log(t + \epsilon)$, and (3) constant-time recursive updates for efficiency. Experiments across non-stationary patterns (distributional changes, periodic shifts, and temporary fluctuations) in synthetic and logistics scenarios demonstrate its superiority over…
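
To make the three ingredients in the abstract concrete, here is a minimal Python sketch. It is not the authors' implementation. The abstract gives only the variance term $\hat{\sigma}_{k}^{2} / (N_{k} + 1)$ and the schedule $\alpha_{t} = \alpha_{0} / \log(t + \epsilon)$, so the rest is assumed for illustration: the exact shape of the index, the extra `beta0` bonus, the use of Welford's recursion for the constant-time updates, and all names (`ArmStats`, `alpha`, `ucb_index`, `run`). The demo environment is stationary and Gaussian, purely to exercise the code.

```python
import math
import random


class ArmStats:
    """Running mean and variance for one arm, updated in O(1) per reward."""

    def __init__(self):
        self.n = 0        # N_k: number of pulls so far
        self.mean = 0.0   # running sample mean
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def update(self, reward):
        # Welford's recursion (assumed here; the abstract only says
        # "constant-time recursive updates").
        self.n += 1
        delta = reward - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (reward - self.mean)

    @property
    def variance(self):
        # Population variance estimate; 0.0 before the first pull.
        return self.m2 / self.n if self.n > 0 else 0.0


def alpha(t, alpha0=1.0, eps=math.e):
    """Exploration coefficient alpha_t = alpha_0 / log(t + eps).

    eps = e is an assumed default that keeps the logarithm >= 1 for t >= 0.
    """
    return alpha0 / math.log(t + eps)


def ucb_index(stats, t, alpha0=1.0, beta0=1.0, eps=math.e):
    """Assumed index: mean + alpha_t * sqrt(var / (N + 1)) + beta0 / sqrt(N + 1).

    The beta0 term is a hypothetical extra bonus so that arms with zero
    observed variance still get explored; it is not taken from the abstract.
    """
    n1 = stats.n + 1
    variance_bonus = alpha(t, alpha0, eps) * math.sqrt(stats.variance / n1)
    return stats.mean + variance_bonus + beta0 / math.sqrt(n1)


def run(means, horizon=500, noise=0.1, seed=0):
    """Play a stationary Gaussian bandit and return the pull count per arm."""
    rng = random.Random(seed)
    stats = [ArmStats() for _ in means]
    for t in range(1, horizon + 1):
        # Pick the arm with the largest index; ties go to the lowest arm id.
        k = max(range(len(means)), key=lambda i: ucb_index(stats[i], t))
        stats[k].update(rng.gauss(means[k], noise))
    return [s.n for s in stats]


if __name__ == "__main__":
    print(run([0.1, 0.9]))
```

Each arm stores only three numbers, so the per-step cost does not grow with the number of past observations. Because $\alpha_{t}$ shrinks as $t$ grows, the variance bonus carries less weight later in the run.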

Code & Models

Repositories

66661654/Raven-UCB
Official

Taxonomy

Topics: Advanced Bandit Algorithms Research · Forecasting Techniques and Applications · Smart Grid Energy Management