Neural Index Policies for Restless Multi-Action Bandits with Heterogeneous Budgets

Himadri S. Pandey; Kai Wang; Gian-Gabriel P. Garcia

arXiv:2510.22069 · cs.LG · October 28, 2025

TL;DR

This paper introduces a neural index policy for restless multi-armed bandits (RMABs) with multiple actions and heterogeneous budget constraints, aiming for scalable, near-optimal decision-making.

Contribution

It proposes a novel neural index policy framework that unifies index prediction and constrained optimization for multi-action RMABs with heterogeneous budgets.

Findings

01. Achieves within 5% of oracle performance

02. Strictly enforces heterogeneous budget constraints

03. Scales to hundreds of arms efficiently

Abstract

Restless multi-armed bandits (RMABs) provide a scalable framework for sequential decision-making under uncertainty, but classical formulations assume binary actions and a single global budget. Real-world settings, such as healthcare, often involve multiple interventions with heterogeneous costs and constraints, where such assumptions break down. We introduce a Neural Index Policy (NIP) for multi-action RMABs with heterogeneous budget constraints. Our approach learns to assign budget-aware indices to arm-action pairs using a neural network, and converts them into feasible allocations via a differentiable knapsack layer formulated as an entropy-regularized optimal transport (OT) problem. The resulting model unifies index prediction and constrained optimization in a single end-to-end differentiable framework, enabling gradient-based training directly on decision quality. The network is…
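
To make the idea of turning indices into allocations through entropy-regularized OT more concrete, here is a minimal sketch using plain Sinkhorn iterations in NumPy. It is not the authors' implementation: the function name `sinkhorn_allocation`, the toy scores, the temperature `eps`, and the simplification of budgets to per-action capacities with unit costs (plus a free "no-op" column that absorbs the remaining arms) are all assumptions made for illustration.

```python
import numpy as np

def sinkhorn_allocation(scores, row_mass, col_mass, eps=0.2, n_iters=500):
    """Approximately solve: maximize <P, scores> + eps * H(P)
    subject to P.sum(axis=1) == row_mass and P.sum(axis=0) == col_mass.
    Hypothetical helper for illustration, not the paper's layer."""
    scores = np.asarray(scores, dtype=float)
    row_mass = np.asarray(row_mass, dtype=float)
    col_mass = np.asarray(col_mass, dtype=float)
    # Gibbs kernel; subtracting the max keeps exp() from overflowing.
    K = np.exp((scores - scores.max()) / eps)
    u = np.ones_like(row_mass)
    v = np.ones_like(col_mass)
    for _ in range(n_iters):
        u = row_mass / (K @ v)    # rescale rows toward row_mass
        v = col_mass / (K.T @ u)  # rescale columns toward col_mass
    # Soft allocation: P[i, a] is the mass arm i places on action a.
    return u[:, None] * K * v[None, :]

# Assumed toy instance: 4 arms, 3 actions; column 0 is a free "no-op".
scores = np.array([
    [0.0, 0.9, 0.2],
    [0.0, 0.1, 0.8],
    [0.0, 0.7, 0.6],
    [0.0, 0.3, 0.1],
])
row_mass = np.ones(4)                  # each arm receives one unit of action mass
col_mass = np.array([2.0, 1.0, 1.0])   # one unit each for actions 1 and 2, rest no-op

P = sinkhorn_allocation(scores, row_mass, col_mass)
print(np.round(P, 3))
```

Every step in the loop is a differentiable array operation, so the same computation written in an autodiff framework would let gradients flow from the allocation back to the scores. Smaller values of `eps` push the plan closer to a hard assignment but typically need more iterations to converge.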
