A/B Testing and Best-arm Identification for Linear Bandits with Robustness to Non-stationarity

Zhihan Xiong; Romain Camilleri; Maryam Fazel; Lalit Jain; Kevin Jamieson

arXiv:2307.15154 · cs.LG · February 16, 2024

TL;DR

This paper introduces an algorithm for fixed-budget best-arm identification in linear bandits when the environment may be non-stationary. It combines robustness to changing parameters with fast identification rates, and the authors report that it outperforms existing methods in diverse settings.

Contribution

We propose the P1-RAGE algorithm, which balances robustness to non-stationarity with rapid identification, filling a gap in existing linear bandit algorithms.

Findings

01. In the worst case, P1-RAGE maintains performance comparable to G-optimal design.
02. In benign, stationary environments, the algorithm achieves faster identification rates.
03. Empirical results show P1-RAGE outperforms existing algorithms across various scenarios.
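
The G-optimal design named in the first finding is a standard experimental-design object: a distribution over arms that minimizes the largest prediction variance $\max_{x \in X} x^{\top} A(\lambda)^{-1} x$, where $A(\lambda) = \sum_{x} \lambda_{x} x x^{\top}$. The sketch below approximates it with a textbook Frank-Wolfe iteration. It is not P1-RAGE and is not taken from the paper's repository; the function names, step-size schedule, and toy arm set are all assumptions made for illustration.

```python
import numpy as np

def g_optimal_design(arms, n_iters=2000):
    """Approximate a G-optimal design over the rows of `arms` with Frank-Wolfe.

    arms: array of shape (K, d) whose rows are assumed to span R^d.
    Returns a probability vector of length K.
    """
    num_arms = arms.shape[0]
    design = np.full(num_arms, 1.0 / num_arms)  # start from the uniform design
    for k in range(n_iters):
        # Information matrix A = sum_i design_i * x_i x_i^T
        info = arms.T @ (design[:, None] * arms)
        info_inv = np.linalg.pinv(info)
        # Prediction variance x_i^T A^{-1} x_i for every arm
        variances = np.einsum("ij,jk,ik->i", arms, info_inv, arms)
        worst = int(np.argmax(variances))
        # Move mass toward the arm with the largest variance
        step = 2.0 / (k + 3)  # illustrative schedule, always below 1
        design = (1.0 - step) * design
        design[worst] += step
    return design

def worst_case_variance(arms, design):
    """Largest x^T A(design)^{-1} x over the arm set."""
    info = arms.T @ (design[:, None] * arms)
    info_inv = np.linalg.pinv(info)
    return float(np.max(np.einsum("ij,jk,ik->i", arms, info_inv, arms)))

# Toy arm set (an assumption): the standard basis of R^3.
arms = np.eye(3)
design = g_optimal_design(arms)
worst = worst_case_variance(arms, design)
```

By the Kiefer-Wolfowitz theorem, the optimal worst-case variance equals the dimension $d$, so on this toy instance `worst` should sit just above 3 and `design` should be close to uniform. Sampling arms non-adaptively from such a design is the robust baseline against which the finding compares P1-RAGE.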

Abstract

We investigate the fixed-budget best-arm identification (BAI) problem for linear bandits in a potentially non-stationary environment. Given a finite arm set $X \subset \mathbb{R}^{d}$, a fixed budget $T$, and an unpredictable sequence of parameters $\{\theta_{t}\}_{t=1}^{T}$, an algorithm will aim to correctly identify the best arm $x^{*} := \arg\max_{x \in X} x^{\top} \sum_{t=1}^{T} \theta_{t}$ with probability as high as possible. Prior work has addressed the stationary setting where $\theta_{t} = \theta_{1}$ for all $t$ and demonstrated that the error probability decreases as $\exp(-T / \rho^{*})$ for a problem-dependent constant $\rho^{*}$. But in many real-world A/B/n multivariate testing scenarios that motivate our work, the environment is non-stationary and an algorithm expecting a stationary setting can easily fail. For robust identification, it is well-known…
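
To make the objective in the abstract concrete, the following minimal sketch computes the best arm $x^{*}$ from a full parameter sequence and shows how a method that only looks at an early window can pick a different arm when the parameters drift. The function name `best_arm` and the toy arms and parameters are assumptions for illustration, not values from the paper, and the code works with the true parameters rather than noisy observations.

```python
import numpy as np

def best_arm(arms, thetas):
    """Index of the arm maximizing x^T sum_t theta_t.

    arms: array of shape (K, d), one arm per row.
    thetas: array of shape (T, d), one parameter vector per round.
    """
    cumulative_theta = thetas.sum(axis=0)
    return int(np.argmax(arms @ cumulative_theta))

# Toy instance (an assumption): two arms in R^2 and a budget of T = 10.
arms = np.array([[1.0, 0.0],
                 [0.0, 1.0]])

# Non-stationary parameters: the first five rounds favor arm 0,
# the last five rounds favor arm 1 more strongly.
thetas = np.vstack([
    np.tile([1.0, 0.0], (5, 1)),
    np.tile([0.0, 2.0], (5, 1)),
])

overall_best = best_arm(arms, thetas)      # uses all T rounds
early_best = best_arm(arms, thetas[:5])    # uses only the first half
print(overall_best, early_best)
```

Here the cumulative parameter is $(5, 10)$, so the best arm over the whole horizon is arm 1, while the first five rounds alone point to arm 0. This is the kind of drift that can mislead an algorithm that assumes $\theta_{t} = \theta_{1}$ for all $t$.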

Code & Models

Repositories

fftypezero/bobw_linear
none · Official

Taxonomy

Topics: Machine Learning and Algorithms · Advanced Bandit Algorithms Research · Advanced Statistical Process Monitoring
