LP-based policies for restless bandits: necessary and sufficient conditions for (exponentially fast) asymptotic optimality
Nicolas Gast (POLARIS), Bruno Gaujal (POLARIS), Chen Yan (POLARIS)

TL;DR
This paper develops LP-based control policies for restless bandits, establishing conditions for their asymptotic optimality and introducing policies that outperform existing heuristics, especially under model uncertainties.
Contribution
The paper provides necessary and sufficient conditions for asymptotic optimality of LP-based policies and introduces the LP-index and LP-update policies with proven convergence rates.
Findings
The LP-index policy is asymptotically optimal with a square-root convergence rate on all models.
The LP-update policy outperforms the LP-index policy and other heuristics in experiments.
The LP-update policy is robust to estimation errors in the transition matrices.
Abstract
We provide a framework to analyse control policies for the restless Markovian bandit model, under both finite and infinite time horizons. We show that when the population of arms goes to infinity, the value of the optimal control policy converges to the solution of a linear program (LP). We provide necessary and sufficient conditions for a generic control policy to be: i) asymptotically optimal; ii) asymptotically optimal with square root convergence rate; iii) asymptotically optimal with exponential rate. We then construct the LP-index policy that is asymptotically optimal with square root convergence rate on all models, and with exponential rate if the model is non-degenerate in finite horizon, and satisfies a uniform global attractor property in infinite horizon. We next define the LP-update policy, which is essentially a repeated LP-index policy that solves a new linear program at…
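For readers unfamiliar with the linear program mentioned in the abstract, here is a sketch of one common way to write the finite-horizon LP relaxation of a restless bandit. The notation is an assumption made for illustration and is not taken from this page: $y_{s,a}(t)$ is the fraction of arms in state $s$ taking action $a$ at time $t$, $r_s^a$ is the one-step reward, $P^a$ is the transition matrix under action $a$, $m(0)$ is the initial state distribution, $\alpha$ is the fraction of arms that may be activated, and $T$ is the horizon. The paper's own formulation may differ in details, for example in how the budget constraint is stated.

```latex
\[
\begin{aligned}
\max_{y \ge 0} \quad
  & \sum_{t=0}^{T-1} \sum_{s \in \mathcal{S}} \sum_{a \in \{0,1\}} r_s^a \, y_{s,a}(t) \\
\text{s.t.} \quad
  & \sum_{a \in \{0,1\}} y_{s,a}(0) = m_s(0)
    && \forall s \in \mathcal{S}, \\
  & \sum_{a \in \{0,1\}} y_{s,a}(t+1)
      = \sum_{s' \in \mathcal{S}} \sum_{a \in \{0,1\}} y_{s',a}(t) \, P^a_{s's}
    && \forall s \in \mathcal{S},\ 0 \le t < T-1, \\
  & \sum_{s \in \mathcal{S}} y_{s,1}(t) = \alpha
    && 0 \le t \le T-1.
\end{aligned}
\]
```

The relaxation replaces the hard constraint "activate exactly $\alpha N$ of the $N$ arms at every step" with a constraint on expected fractions, which is why its optimal value upper-bounds the per-arm value of any policy for the $N$-arm problem and can serve as the limit described in the abstract.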
