Finite-Horizon Single-Pull Restless Bandits: An Efficient Index Policy For Scarce Resource Allocation

Guojun Xiong, Haichuan Wang, Yuqi Pan, Saptarshi Mandal, Sanket Shah, Niclas Boehmer, and Milind Tambe

arXiv:2501.06103·cs.MA·January 13, 2025

TL;DR

This paper introduces Finite-Horizon Single-Pull RMABs, a new variant of restless bandits for resource-scarce settings in which each agent can receive a resource at most once, and proposes an efficient index policy with proven near-optimality.

Contribution

We define the SPRMAB model with a single-pull constraint and develop a lightweight index policy that achieves a sub-linearly decaying optimality gap, validated through extensive simulations.

Findings

01

The index policy achieves a sub-linearly decaying average optimality gap of $\frac{1}{\rho^{1/2}}$.

02

The proposed method outperforms existing benchmarks in various simulation domains.

03

The dummy state expansion effectively enforces the single-pull constraint.

Abstract

Restless multi-armed bandits (RMABs) have been highly successful in optimizing sequential resource allocation across many domains. However, in many practical settings with highly scarce resources, where each agent can only receive at most one resource, such as healthcare intervention programs, the standard RMAB framework falls short. To tackle such scenarios, we introduce Finite-Horizon Single-Pull RMABs (SPRMABs), a novel variant in which each arm can only be pulled once. This single-pull constraint introduces additional complexity, rendering many existing RMAB solutions suboptimal or ineffective. To address this shortcoming, we propose using dummy states that expand the system and enforce the one-pull constraint.…
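The dummy-state idea in the abstract can be illustrated with a small sketch. The excerpt does not show the paper's exact construction, so the code below is one possible reading and not the authors' implementation. It assumes each arm has S real states with passive and active transition matrices, that pulling an arm in a real state moves it into a duplicated "dummy" copy of the state space, and that in dummy states both actions follow the passive dynamics. The function name `expand_with_dummy_states` and the example matrices are hypothetical.

```python
def expand_with_dummy_states(p_passive, p_active):
    """Return (passive, active) transition matrices over 2*S states.

    States 0..S-1 are the real states; states S..2S-1 are their dummy copies.
    Assumption (not from the source): pulling has no effect in dummy states.
    """
    n = len(p_passive)
    zeros = [0.0] * n
    passive, active = [], []
    for s in range(n):
        # Real state, not pulled: stay among real states, passive dynamics.
        passive.append(list(p_passive[s]) + zeros)
        # Real state, pulled: active dynamics, but land in the dummy copy.
        active.append(zeros + list(p_active[s]))
    for s in range(n):
        # Dummy state: the arm was already pulled, so both actions follow
        # the passive dynamics and never leave the dummy copy.
        row = zeros + list(p_passive[s])
        passive.append(row)
        active.append(list(row))
    return passive, active


# Hypothetical two-state arm.
p0 = [[0.9, 0.1], [0.3, 0.7]]  # passive transitions
p1 = [[0.4, 0.6], [0.1, 0.9]]  # active transitions
passive, active = expand_with_dummy_states(p0, p1)
```

Under these assumptions, a second pull of the same arm can never change its trajectory, because no transition leads from a dummy state back to a real state. This is what lets the expanded model encode the one-pull constraint in the dynamics themselves.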

Taxonomy

Topics: Advanced Bandit Algorithms Research · Smart Grid Energy Management · Cognitive Radio Networks and Spectrum Sensing