Indexability is Not Enough for Whittle: Improved, Near-Optimal Algorithms for Restless Bandits
Abheek Ghosh, Dheeraj Nagaraj, Manish Jain, Milind Tambe

TL;DR
This paper shows that Whittle index policies for restless bandits can fail even when the bandits are indexable, and introduces a mean-field-based algorithm that is provably near-optimal, hyper-parameter free, and performs well in practice.
Contribution
The authors propose a new mean-field planning algorithm for RMABs that overcomes the limitations of Whittle index policies and provides rigorous theoretical guarantees.
Findings
Whittle index policies can fail in simple, practically relevant RMAB settings, even when the RMABs are indexable.
The mean-field algorithm is hyper-parameter free and provably near-optimal.
Experimental results show the mean-field approach outperforms existing baselines.
Abstract
We study the problem of planning restless multi-armed bandits (RMABs) with multiple actions. This is a popular model for multi-agent systems with applications like multi-channel communication, monitoring and machine maintenance tasks, and healthcare. Whittle index policies, which are based on Lagrangian relaxations, are widely used in these settings due to their simplicity and near-optimality under certain conditions. In this work, we first show that Whittle index policies can fail in simple and practically relevant RMAB settings, even when the RMABs are indexable. We discuss why the optimality guarantees fail and why asymptotic optimality may not translate well to practically relevant planning horizons. We then propose an alternate planning algorithm based on the mean-field method, which can provably and efficiently obtain near-optimal policies with a large number of arms, without…
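The mean-field idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it only shows how, with many identical arms, the fraction of arms in each state evolves deterministically under a fixed priority policy with an activation budget. The two-state transition matrix `P`, the rewards `R`, the budget `alpha`, and all function names are hypothetical values chosen for illustration.

```python
# Hypothetical two-state, two-action arm (state 0 = bad, state 1 = good).
# P[s][a][s2] = probability of moving from s to s2 under action a
# (a = 0 is passive, a = 1 is active). All numbers are illustrative.
P = [
    [[0.9, 0.1], [0.4, 0.6]],  # from state 0: passive, active
    [[0.3, 0.7], [0.1, 0.9]],  # from state 1: passive, active
]
# R[s][a] = per-arm reward; here an arm earns 1 whenever it is in the good state.
R = [[0.0, 0.0], [1.0, 1.0]]


def priority_fractions(mu, order, alpha):
    """Population fraction activated in each state when states are served
    in the given priority order until the budget alpha is exhausted."""
    active = [0.0] * len(mu)
    remaining = alpha
    for s in order:
        take = min(mu[s], remaining)
        active[s] = take
        remaining -= take
    return active


def mean_field_step(mu, active, P):
    """Deterministic update of the state distribution for one time step."""
    n = len(mu)
    nxt = [0.0] * n
    for s in range(n):
        passive = mu[s] - active[s]
        for s2 in range(n):
            nxt[s2] += passive * P[s][0][s2] + active[s] * P[s][1][s2]
    return nxt


def rollout(mu0, order, alpha, P, R, horizon):
    """Average per-arm reward accumulated over a finite horizon, plus the
    final state distribution, under the fixed priority policy."""
    mu, total = list(mu0), 0.0
    for _ in range(horizon):
        active = priority_fractions(mu, order, alpha)
        total += sum(
            (mu[s] - active[s]) * R[s][0] + active[s] * R[s][1]
            for s in range(len(mu))
        )
        mu = mean_field_step(mu, active, P)
    return total, mu


if __name__ == "__main__":
    # Compare two priority orders with a 30% activation budget.
    for order in ([0, 1], [1, 0]):
        value, final_mu = rollout([0.5, 0.5], order, 0.3, P, R, horizon=10)
        print(order, round(value, 4), [round(x, 4) for x in final_mu])
```

Comparing the two priority orders shows how the choice of which states to serve first changes the finite-horizon value in the large-population limit. A planner would search over such allocations rather than fix one in advance.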
Taxonomy
Topics: Advanced Bandit Algorithms Research · Smart Grid Energy Management · Optimization and Search Problems
