Identifying the Best Arm in the Presence of Global Environment Shifts

Phurinut Srisawad; Juergen Branke; Long Tran-Thanh

arXiv:2408.12581 · cs.LG · August 23, 2024

TL;DR

This paper introduces a new problem setting for best-arm identification in non-stationary bandits affected by global environmental shifts, proposing novel policies that outperform existing methods in practice.

Contribution

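The paper formulates a new fixed-budget best-arm identification problem in non-stationary bandits where a global environmental influence shifts the means of all arms in the same way, and develops a consistent, robust selection policy together with an allocation policy, LinLUCB, tailored to this setting.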

Findings

01

Proposed policies outperform existing methods in empirical tests.

02

The proposed selection policy is consistent and robust under global environmental shifts.

03

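Existing solutions for Adversarial Bandits and Corrupted Bandits do not fully exploit the global shift structure and, despite their theoretical guarantees, do not work well in practice.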

Abstract

This paper formulates a new Best-Arm Identification problem in the non-stationary stochastic bandits setting, where the means of all arms are shifted in the same way due to a global influence of the environment. The aim is to identify the unique best arm across environmental change given a fixed total budget. While this setting can be regarded as a special case of Adversarial Bandits or Corrupted Bandits, we demonstrate that existing solutions tailored to those settings do not fully utilise the nature of this global influence, and thus, do not work well in practice (despite their theoretical guarantees). To overcome this issue, in this paper we develop a novel selection policy that is consistent and robust in dealing with global environmental shifts. We then propose an allocation policy, LinLUCB, which exploits information about global shifts across all arms in each environment.…
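To make the setting concrete, the following is a minimal sketch of why a global shift can mislead a naive estimator. It is not the paper's selection policy or LinLUCB. It assumes the shift is additive (reward = arm mean + environment shift + Gaussian noise), which is one reading of "shifted in the same way", and all function names (`simulate`, `pooled_best`, `centred_best`) are hypothetical. The sketch compares a pooled sample mean against an estimate that subtracts each environment's average level before aggregating.

```python
import random


def simulate(base_means, shifts, pulls, noise_sd=0.05, seed=0):
    """Return rewards[e][k], the list of samples for arm k in environment e.

    pulls[e][k] is how often arm k is sampled in environment e.
    Assumption: the global shift is additive, so every arm in
    environment e has mean base_means[k] + shifts[e].
    """
    rng = random.Random(seed)
    return [
        [
            [base_means[k] + shifts[e] + rng.gauss(0.0, noise_sd)
             for _ in range(pulls[e][k])]
            for k in range(len(base_means))
        ]
        for e in range(len(shifts))
    ]


def pooled_best(rewards):
    """Pick the arm with the highest sample mean, ignoring environments."""
    n_arms = len(rewards[0])
    means = []
    for k in range(n_arms):
        samples = [x for env in rewards for x in env[k]]
        means.append(sum(samples) / len(samples))
    return max(range(n_arms), key=means.__getitem__)


def centred_best(rewards):
    """Pick the arm with the highest environment-centred score.

    Within each environment, subtract the average of the per-arm means,
    which cancels an additive shift common to all arms. Requires every
    arm to be sampled at least once in every environment.
    """
    n_arms = len(rewards[0])
    scores = [0.0] * n_arms
    for env in rewards:
        arm_means = [sum(samples) / len(samples) for samples in env]
        env_level = sum(arm_means) / n_arms
        for k in range(n_arms):
            scores[k] += arm_means[k] - env_level
    return max(range(n_arms), key=scores.__getitem__)


# Arm 1 is truly best, but arm 0 is sampled mostly in the high-shift environment.
rewards = simulate(
    base_means=[0.5, 0.6],
    shifts=[0.0, 1.0],
    pulls=[[10, 90], [90, 10]],
)
print("pooled choice:", pooled_best(rewards))
print("centred choice:", centred_best(rewards))
```

In this configuration the pooled mean of arm 0 is inflated by the second environment's shift, so the pooled rule tends to prefer the worse arm, while the centred score compares arms only within an environment and is unaffected by the shift. The example is purely illustrative; how the paper's policies actually use the shift information is described in the paper itself.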

Taxonomy

Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Forecasting Techniques and Applications