From Restless to Contextual: A Thresholding Bandit Reformulation For Finite-horizon Improvement

Jiamin Xu; Ivan Nazarov; Aditya Rastogi; África Periáñez; Kyra Gan

arXiv:2502.05145·cs.LG·April 7, 2026

TL;DR

This paper reformulates online restless bandits as a budgeted thresholding contextual bandit. The resulting learning policy achieves sublinear regret, converges faster than existing methods, and improves finite-horizon performance empirically.

Contribution

It reformulates online restless bandits as a budgeted thresholding contextual bandit, which encodes long-term state transitions into a scalar reward, and proves the first non-asymptotic optimality of an oracle policy for a simplified finite-horizon setting.

Findings

1. Achieves sublinear regret in a multi-state, heterogeneous setting.
2. Demonstrates faster convergence than existing algorithms.
3. Empirically outperforms state-of-the-art methods in large-scale environments.

Abstract

This paper addresses the poor finite-horizon performance of existing online restless bandit (RB) algorithms, which stems from the prohibitive sample complexity of learning a full Markov decision process (MDP) for each agent. We argue that superior finite-horizon performance requires rapid convergence to a high-quality policy. Thus motivated, we reformulate online RBs as a budgeted thresholding contextual bandit, which simplifies the learning problem by encoding long-term state transitions into a scalar reward. We prove the first non-asymptotic optimality of an oracle policy for a simplified finite-horizon setting. We propose a practical learning policy for a heterogeneous-agent, multi-state setting and show that it achieves sublinear regret and converges faster than existing methods. This directly translates to…
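To make the phrase "budgeted thresholding" concrete, here is a minimal illustrative sketch of a generic selection rule: given one scalar reward estimate per agent, act on at most `budget` agents whose estimate exceeds a threshold. This is not the paper's algorithm. The function name `select_agents`, the default threshold of 0, and the highest-first tie-breaking are all assumptions made for illustration, and how the scalar estimates are learned is left out entirely.

```python
from typing import List, Sequence


def select_agents(
    estimated_reward: Sequence[float],
    budget: int,
    threshold: float = 0.0,
) -> List[int]:
    """Pick at most `budget` agents whose estimated reward exceeds `threshold`.

    Hypothetical helper for illustration only; not taken from the paper.
    `estimated_reward[i]` stands in for a scalar summary of agent i's
    long-term benefit from being acted on in its current context.
    """
    if budget < 0:
        raise ValueError("budget must be non-negative")
    # Thresholding step: keep only agents estimated to clear the threshold.
    candidates = [i for i, r in enumerate(estimated_reward) if r > threshold]
    # Budget step: if too many agents qualify, prefer the largest estimates.
    candidates.sort(key=lambda i: estimated_reward[i], reverse=True)
    return candidates[:budget]
```

For example, with estimates `[0.3, -0.1, 0.8, 0.05]` and a budget of 2, this rule would act on agents 2 and 0; if no estimate clears the threshold, part of the budget simply goes unused.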

Code & Models

Repositories

jamie01713/EGT (GitHub)
