Continuous-time multi-armed bandits under random intervention times

Kei Noba; José Luis Pérez; Kazutoshi Yamazaki; Qingyuan Zhang

arXiv:2603.03661·math.OC·March 5, 2026

Open Access

TL;DR

This paper studies multi-armed bandit problems in which actions can be taken only at random discrete times. It gives explicit characterizations of the Gittins index for arms driven by Lévy processes and diffusions, and supports them with numerical experiments.

Contribution

The paper gives an explicit characterization of the Gittins index for arms modeled by Lévy processes under random intervention times, extending classical bandit theory.

Findings

01

Explicit Gittins index for Lévy process arms.

02

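With exponential inter-arrival times, the Gittins index is expressed via the scale function (spectrally negative Lévy arms, reflected or not) or the diffusion characteristics (diffusion arms).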

03

Numerical validation of theoretical index formulas.

Abstract

This paper examines multi-armed bandits in which actions are taken at random discrete times. The model consists of $J$ independent arms. When an arm is operated, it must remain active for a random duration, modeled by the inter-arrival time of a (possibly arm-dependent) renewal process. For arms evolving as a Lévy process, we provide an explicit characterization of the Gittins index, which is known to yield an optimal strategy. Furthermore, when the inter-arrival times are exponential and the arms evolve as either a spectrally negative Lévy process, a reflected spectrally negative Lévy process, or a diffusion process, the Gittins index is explicitly characterized in terms of the scale function or diffusion characteristics, respectively. Numerical experiments are performed to support the theoretical results.
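
The model described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code: it assumes each arm is a Brownian motion with drift (one simple example of a Lévy process), assumes exponential inter-arrival times, and takes a user-supplied `index_fn` as a stand-in for the Gittins index, whose explicit form is given in the paper and not reproduced here. The names `operate_arm`, `run_policy`, and `index_fn` are hypothetical.

```python
import math
import random


def operate_arm(x, drift, vol, rate, rng):
    """Operate one arm for a single random duration.

    Assumption: the duration is Exp(rate) and the arm moves as a Brownian
    motion with drift, so its increment over the duration is Gaussian.
    """
    duration = rng.expovariate(rate)
    shock = rng.gauss(0.0, 1.0)
    new_x = x + drift * duration + vol * math.sqrt(duration) * shock
    return duration, new_x


def run_policy(x0, drifts, vols, rates, index_fn, horizon, seed=0):
    """Run an index policy over J independent arms until `horizon`.

    At each decision time the arm with the largest index is chosen and stays
    active for its random duration; the other arms are frozen.
    `index_fn(j, x)` is a placeholder, NOT the paper's Gittins index.
    """
    rng = random.Random(seed)
    states = list(x0)
    t = 0.0
    log = []  # entries are (decision_time, chosen_arm, duration)
    while t < horizon:
        arm = max(range(len(states)), key=lambda j: index_fn(j, states[j]))
        duration, states[arm] = operate_arm(
            states[arm], drifts[arm], vols[arm], rates[arm], rng
        )
        log.append((t, arm, duration))
        t += duration
    return states, log


if __name__ == "__main__":
    # Two arms with a naive index equal to the current state (illustrative only).
    final_states, history = run_policy(
        x0=[1.0, 0.5],
        drifts=[-0.2, -0.1],
        vols=[0.3, 0.3],
        rates=[2.0, 3.0],
        index_fn=lambda j, x: x,
        horizon=5.0,
        seed=42,
    )
    print(len(history), final_states)
```

The key feature of the model shows up in the loop: a decision can only be revised at the end of the current random duration, so the controller cannot switch arms in between. Replacing `index_fn` with an implementation of the paper's index would turn this into a test bed for the theoretical results, under the same modeling assumptions.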

Taxonomy

Topics: Advanced Bandit Algorithms Research · Game Theory and Applications · Reinforcement Learning in Robotics