Optimal Batched Linear Bandits

Xuanfei Ren; Tianyuan Jin; Pan Xu

arXiv:2406.04137·cs.LG·June 7, 2024



TL;DR

The paper introduces E$^4$, a batched linear bandit algorithm that attains minimax-optimal regret with $O(\log \log T)$ batches and asymptotically optimal regret with only 3 batches as $T \to \infty$, and that outperforms baseline algorithms in experiments.

Contribution

E$^4$ is the first algorithm to simultaneously attain minimax and asymptotic optimality in regret with optimal batch complexities in linear bandits.

Findings

01. Achieves finite-time minimax optimal regret with O(log log T) batches.

02. Achieves asymptotically optimal regret with only 3 batches as T→∞.

03. Outperforms baseline algorithms in experiments on challenging instances.

Abstract

We introduce the E$^4$ algorithm for the batched linear bandit problem, incorporating an Explore-Estimate-Eliminate-Exploit framework. With a proper choice of exploration rate, we prove E$^4$ achieves the finite-time minimax optimal regret with only $O(\log \log T)$ batches, and the asymptotically optimal regret with only $3$ batches as $T \to \infty$, where $T$ is the time horizon. We further prove a lower bound on the batch complexity of linear contextual bandits showing that any asymptotically optimal algorithm must require at least $3$ batches in expectation as $T \to \infty$, which indicates E$^4$ achieves the asymptotic optimality in regret and batch complexity simultaneously. To the best of our knowledge, E$^4$ is the first algorithm for linear bandits that simultaneously achieves the minimax and asymptotic optimality in regret with the corresponding optimal batch…
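To give a feel for how few batches $O(\log \log T)$ is, the sketch below builds a batch schedule on the grid $t_k = \lceil T^{1 - 2^{-k}} \rceil$, which is a common choice in the batched bandit literature. This is an illustrative assumption only: the abstract does not state which grid E$^4$ uses, and the function name `batch_grid` is hypothetical.

```python
import math


def batch_grid(horizon: int) -> list[int]:
    """Return increasing batch end times on the grid t_k = ceil(T^(1 - 2^-k)).

    Assumption: this is a generic grid from the batched bandit literature,
    not necessarily the schedule used by E^4. The final batch always ends
    at the horizon T.
    """
    if horizon < 1:
        raise ValueError("horizon must be a positive integer")
    if horizon < 2:
        return [horizon]
    # ceil(log2(log2(T))) intermediate grid points plus one final batch.
    num_batches = max(1, math.ceil(math.log2(math.log2(horizon)))) + 1
    ends = {
        min(horizon, math.ceil(horizon ** (1.0 - 2.0 ** (-k))))
        for k in range(1, num_batches)
    }
    ends.add(horizon)  # last batch runs to the end of the horizon
    return sorted(ends)


if __name__ == "__main__":
    for T in (10**3, 10**6, 10**9):
        grid = batch_grid(T)
        print(f"T={T}: {len(grid)} batches, end times {grid}")
```

On this grid each batch is roughly the previous batch length times $\sqrt{T}$ divided by its square root, so the number of batches grows doubly logarithmically in $T$; even a horizon of a billion rounds needs only a handful of policy updates.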


Code & Models

Repositories

panxulab/optimal-batched-linear-bandits (Official)


Taxonomy

Topics: Advanced Bandit Algorithms Research · Smart Grid Energy Management · Machine Learning and Algorithms