Achieving $\tilde{\mathcal{O}}(1/N)$ Optimality Gap in Restless Bandits through Gaussian Approximation

Chen Yan; Weina Wang; Lei Ying

arXiv:2410.15003·math.OC·May 27, 2025

TL;DR

This paper introduces a Gaussian-approximation-based policy for finite-horizon Restless Multi-Armed Bandits (RMABs) that, under a uniqueness assumption, achieves an $\tilde{\mathcal{O}}(1/N)$ optimality gap in degenerate cases, improving on the gap attained by LP-based policies.

Contribution

The paper presents the first stochastic-programming approach, built on a Gaussian approximation, to attain an $\tilde{\mathcal{O}}(1/N)$ optimality gap in degenerate RMABs, a setting not covered by earlier guarantees for non-degenerate RMABs.

Findings

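01. Achieves an $\tilde{\mathcal{O}}(1/N)$ optimality gap for degenerate RMABs.

02. Uses a Gaussian stochastic system, which captures both the mean and the variance, to approximate the RMAB dynamics more accurately than the fluid approximation.

03. First to establish such an optimality gap in degenerate settings.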

Abstract

We study the finite-horizon Restless Multi-Armed Bandit (RMAB) problem with $N$ homogeneous arms. Prior work has shown that when an RMAB satisfies a non-degeneracy condition, Linear-Programming-based (LP-based) policies derived from the fluid approximation, which captures the mean dynamics of the system, achieve an exponentially small optimality gap. However, it is common for RMABs to be degenerate, in which case LP-based policies can result in a $\Theta(1/\sqrt{N})$ optimality gap per arm. In this paper, we propose a novel Stochastic-Programming-based (SP-based) policy that, under a uniqueness assumption, achieves an $\tilde{\mathcal{O}}(1/N)$ optimality gap for degenerate RMABs. Our approach is based on the construction of a Gaussian stochastic system that captures not only the mean but also the variance of the RMAB dynamics, resulting in a more accurate approximation than the fluid…
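To make the contrast between a fluid and a Gaussian approximation concrete, the following is a minimal sketch and not the paper's SP-based policy. It assumes a hypothetical two-state arm with a single fixed transition matrix `P` (no actions and no budget constraint), and compares one exact step of $N$ independent arms against the fluid prediction (mean only) and a Gaussian prediction (mean plus a covariance of order $1/N$). All names (`fluid_step`, `gaussian_cov`, `simulate_step`) and all numbers are illustrative assumptions.

```python
import numpy as np

def fluid_step(x, P):
    """Fluid (mean-only) one-step update of the state-fraction vector x."""
    return x @ P

def gaussian_cov(x, P, N):
    """Covariance of the next state fractions for N independent arms.

    Arms in state s move multinomially with probabilities P[s], so the
    covariance of the fractions is (1/N) * sum_s x[s] * (diag(P[s]) - P[s] P[s]^T).
    """
    S = len(x)
    cov = np.zeros((S, S))
    for s in range(S):
        cov += x[s] * (np.diag(P[s]) - np.outer(P[s], P[s]))
    return cov / N

def simulate_step(counts, P, rng):
    """Exact one-step transition of the arm counts (one multinomial per state)."""
    new_counts = np.zeros(len(counts), dtype=int)
    for s, n_s in enumerate(counts):
        new_counts += rng.multinomial(n_s, P[s])
    return new_counts

# Hypothetical toy instance: 2 states, 1000 arms, 60% of arms in state 0.
P = np.array([[0.7, 0.3],
              [0.4, 0.6]])
N = 1000
counts = np.array([600, 400])
x = counts / N

rng = np.random.default_rng(0)
samples = np.array([simulate_step(counts, P, rng) / N for _ in range(20000)])

print("fluid mean      :", fluid_step(x, P))
print("empirical mean  :", samples.mean(axis=0))
print("Gaussian std    :", np.sqrt(np.diag(gaussian_cov(x, P, N))))
print("empirical std   :", samples.std(axis=0))
```

In this toy setting the fluid update matches the empirical mean but says nothing about the fluctuations, whose standard deviation scales as $1/\sqrt{N}$; the Gaussian covariance accounts for them. This is only meant to illustrate why a model that tracks the variance as well as the mean can be a more accurate approximation.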

Taxonomy

Topics: Advanced Bandit Algorithms Research · Smart Grid Energy Management · Cognitive Radio Networks and Spectrum Sensing

Methods: Diffusion