Online Learning of Whittle Indices for Restless Bandits with Non-Stationary Transition Kernels

Md Kamran Chowdhury Shisher; Vishrant Tripathi; Mung Chiang; Christopher G. Brinton

arXiv:2506.18186·cs.LG·April 22, 2026

TL;DR

This paper introduces an adaptive online algorithm for resource allocation in restless bandit problems whose dynamics are unknown and non-stationary. The algorithm builds on Whittle index policies, comes with a sub-linear dynamic regret guarantee, and remains computationally efficient.

Contribution

The paper proposes a Sliding-Window Online Whittle (SW-Whittle) policy that adapts to unknown, time-varying transition kernels, establishes sub-linear dynamic regret guarantees for it, and gives a method to tune the window size online.

Findings

01. Algorithm achieves sub-linear dynamic regret.

02. Outperforms baselines in non-stationary environments.

03. Effectively adapts to unknown variation budgets.

Abstract

The restless multi-armed bandit (RMAB) framework is a popular approach to solving resource allocation problems in networked systems. In this paper, we study optimal resource allocation in RMABs facing unknown and non-stationary dynamics. Solving RMABs optimally is known to be PSPACE-hard even with full knowledge of model parameters. While Whittle index policies offer asymptotic optimality with low computational cost, they require access to stationary transition kernels, an unrealistic assumption in many modern networking applications. To address this challenge, we propose a Sliding-Window Online Whittle (SW-Whittle) policy that remains computationally efficient while adapting to time-varying kernels. Through theoretical analysis, we show that our algorithm achieves sub-linear dynamic regret with respect to the number of episodes. We further address the important case where the variation…
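
The abstract does not say how SW-Whittle estimates the time-varying kernels, so the following is only a minimal sketch of the general sliding-window idea, not the authors' algorithm. It assumes a single arm with finite states and actions, and estimates the transition kernel from the transition counts of the most recent episodes only, so that older data from a drifted kernel is discarded. The class name `SlidingWindowKernelEstimator`, its parameters, and the uniform fallback for unvisited state-action pairs are all hypothetical choices made for illustration. Computing Whittle indices from the estimate is left out.

```python
from collections import deque

import numpy as np


class SlidingWindowKernelEstimator:
    """Hypothetical helper: empirical transition kernel over the last `window` episodes."""

    def __init__(self, n_states, n_actions, window):
        self.n_states = n_states
        self.n_actions = n_actions
        # deque(maxlen=...) drops the oldest episode automatically.
        self.episodes = deque(maxlen=window)

    def add_episode(self, transitions):
        """Record one episode given as (state, action, next_state) triples."""
        counts = np.zeros((self.n_states, self.n_actions, self.n_states))
        for s, a, s_next in transitions:
            counts[s, a, s_next] += 1
        self.episodes.append(counts)

    def estimate(self):
        """Return P_hat[s, a, s'] built from the episodes still in the window."""
        shape = (self.n_states, self.n_actions, self.n_states)
        total = sum(self.episodes, np.zeros(shape))
        visits = total.sum(axis=2, keepdims=True)
        # Assumption: fall back to a uniform row when (s, a) was never visited.
        uniform = np.full(shape, 1.0 / self.n_states)
        return np.where(visits > 0, total / np.maximum(visits, 1), uniform)


# Usage: with window=2, the first episode is forgotten after the third arrives.
est = SlidingWindowKernelEstimator(n_states=2, n_actions=2, window=2)
est.add_episode([(0, 0, 1)])  # old behaviour: state 0 moves to state 1
est.add_episode([(0, 0, 0)])  # new behaviour: state 0 stays in state 0
est.add_episode([(0, 0, 0)])
P_hat = est.estimate()
```

A short window tracks fast-changing kernels but yields noisier estimates, while a long window averages over stale data. That trade-off is presumably why the paper treats online tuning of the window size as a separate problem.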
