Almost Minimax Optimal Best Arm Identification in Piecewise Stationary Linear Bandits
Yunlong Hou, Vincent Y. F. Tan, Zixin Zhong

TL;DR
This paper introduces a piecewise stationary linear bandit (PSLB) model and an algorithm, PSεBAI+, that identifies an ε-optimal arm with high probability using a near-minimal number of samples, even though the contexts, their distribution, and the changepoints are all unknown to the agent.
Contribution
The paper presents a new model for piecewise stationary linear bandits and an almost minimax optimal algorithm that detects changepoints and aligns contexts in order to identify an ε-optimal arm.
Findings
PSεBAI+ achieves near-optimal sample complexity.
The PSεBAI subroutine actively detects changepoints and aligns contexts to support arm identification.
Numerical experiments confirm the efficiency of PSεBAI+.
Abstract
We propose a novel piecewise stationary linear bandit (PSLB) model, where the environment randomly samples a context from an unknown probability distribution at each changepoint, and the quality of an arm is measured by its return averaged over all contexts. The contexts and their distribution, as well as the changepoints, are unknown to the agent. We design Piecewise-Stationary ε-Best Arm Identification+ (PSεBAI+), an algorithm that is guaranteed to identify an ε-optimal arm with probability at least 1 − δ and with a minimal number of samples. PSεBAI+ consists of two subroutines, PSεBAI and Naïve ε-BAI (NεBAI), which are executed in parallel. PSεBAI actively detects changepoints and aligns contexts to facilitate the arm identification process. When…
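The model described in the abstract can be made concrete with a small simulation. The sketch below is illustrative only and is not the paper's algorithm: it assumes that each context is a latent vector θ, that the expected reward of an arm x is the inner product of x and θ, that noise is Gaussian, and that changepoints occur at fixed, hand-picked time steps. All names (`ensemble_means`, `epsilon_optimal_arms`, `simulate`) and all numbers are hypothetical.

```python
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def ensemble_means(arms, contexts, probs):
    """Quality of each arm: its expected return averaged over the context distribution."""
    return [sum(p * dot(x, theta) for theta, p in zip(contexts, probs)) for x in arms]


def epsilon_optimal_arms(means, eps):
    """Indices of arms whose averaged return is within eps of the best averaged return."""
    best = max(means)
    return [i for i, m in enumerate(means) if m >= best - eps]


def simulate(arms, contexts, probs, changepoints, horizon, policy, noise_sd=1.0, seed=0):
    """Run the environment: a fresh context is drawn at every changepoint.

    Assumption: rewards are linear in the arm vector plus Gaussian noise.
    The context index is recorded only for inspection; an agent would not see it.
    """
    rng = random.Random(seed)
    cps = set(changepoints)
    j = rng.choices(range(len(contexts)), weights=probs)[0]  # initial context
    history = []
    for t in range(horizon):
        if t in cps:
            j = rng.choices(range(len(contexts)), weights=probs)[0]
        i = policy(t)
        reward = dot(arms[i], contexts[j]) + rng.gauss(0.0, noise_sd)
        history.append((t, i, j, reward))
    return history


# Hypothetical instance: two contexts, three arms in two dimensions.
ARMS = [(1.0, 0.0), (0.0, 1.0), (0.6, 0.6)]
CONTEXTS = [(1.0, 0.0), (0.0, 1.0)]
PROBS = [0.5, 0.5]

MEANS = ensemble_means(ARMS, CONTEXTS, PROBS)
GOOD_ARMS = epsilon_optimal_arms(MEANS, eps=0.05)

# Round-robin sampling is a placeholder policy, not a BAI strategy.
HISTORY = simulate(ARMS, CONTEXTS, PROBS, changepoints=[10, 20], horizon=30,
                   policy=lambda t: t % len(ARMS))
```

In this toy instance the third arm is not the best arm under either individual context, yet it has the highest return once averaged over contexts. This is why the objective depends on the context distribution rather than on whichever context happens to be active.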
Taxonomy
Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Machine Learning and Algorithms
