A General Framework of Multi-Armed Bandit Processes by Arm Switch Restrictions

Wenqing Bao; Xiaoqiang Cai; Xianyi Wu

arXiv:1808.06314 · math.PR · December 28, 2021

TL;DR

This paper introduces a unified framework for multi-armed bandit processes with switch restrictions, extending classical models and simplifying the proof of Gittins index policy optimality.

Contribution

It develops a general theory for MAB processes with switch restrictions, unifying various existing models and introducing new proof techniques for Gittins index optimality.

Findings

01 Gittins index process constructed under switch restrictions

02 Optimality of Gittins index rule established in the new framework

03 Framework encompasses classical and new MAB models

Abstract

This paper proposes a general framework of multi-armed bandit (MAB) processes by introducing a type of restriction on the switches among arms evolving in continuous time. The Gittins index process is constructed for any single arm subject to the restrictions on switches, and the optimality of the corresponding Gittins index rule is then established. The Gittins indices defined in this paper are consistent with those for MAB processes in continuous time, in integer time, in the semi-Markovian setting, and in the general discrete-time setting, so the new theory covers the classical models as special cases and also applies to many other situations that have not yet been addressed in the literature. While the proof of the optimality of Gittins index policies benefits from ideas in the existing theory of MAB processes in continuous time, new techniques are introduced which drastically simplify…
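For readers unfamiliar with the classical special case mentioned in the abstract, the following is a minimal sketch of how a Gittins index can be computed for a single finite-state Markov arm in discounted discrete time, with no switch restrictions. It is not the paper's construction: it uses the well-known restart-in-state formulation solved by value iteration, and the function name `gittins_indices`, the inputs `P`, `r`, `beta`, and the tolerance settings are all assumptions made for illustration.

```python
def gittins_indices(P, r, beta, tol=1e-10, max_iter=100_000):
    """Classical discrete-time Gittins indices for one finite-state Markov arm.

    Illustrative sketch only (hypothetical helper, not from the paper).
    P    : transition matrix as a list of rows, P[j][k] = Pr(next = k | current = j)
    r    : list of one-step rewards, r[j] collected when the arm is pulled in state j
    beta : discount factor, 0 < beta < 1
    Returns the indices in reward-rate units, i.e. (1 - beta) * V_i(i).
    """
    n = len(r)
    indices = []
    for i in range(n):
        # Restart problem for state i: in every state j the controller may either
        # continue from j or restart the arm as if it were in state i.
        V = [0.0] * n
        for _ in range(max_iter):
            restart = r[i] + beta * sum(P[i][k] * V[k] for k in range(n))
            new_V = [
                max(r[j] + beta * sum(P[j][k] * V[k] for k in range(n)), restart)
                for j in range(n)
            ]
            diff = max(abs(a - b) for a, b in zip(new_V, V))
            V = new_V
            if diff < tol:
                break
        # The optimal value at the restart state, rescaled by (1 - beta),
        # is the Gittins index of state i.
        indices.append((1.0 - beta) * V[i])
    return indices


# Toy arm (assumed data): state 0 pays 1 once, then the arm is stuck in state 1 paying 0.
toy_P = [[0.0, 1.0], [0.0, 1.0]]
toy_r = [1.0, 0.0]
toy_indices = gittins_indices(toy_P, toy_r, beta=0.9)
```

In the toy arm the index of state 0 is attained by stopping after a single pull, which is the stopping-time interpretation that the index rule relies on: always play the arm whose current index is largest.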

Taxonomy

Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Optimization and Search Problems