TL;DR
This paper introduces BLAE, a batched linear bandit algorithm that achieves minimax optimal regret (up to logarithmic factors in T) using only O(log log T) batches, with low computational overhead and strong empirical performance.
Contribution
BLAE is the first batched linear bandit algorithm to attain minimax optimal regret across all regimes while maintaining low computational overhead and strong empirical results.
Findings
Achieves minimax optimal regret up to logarithmic factors in T.
Uses only O(log log T) batches, significantly fewer than previous methods.
Outperforms state-of-the-art algorithms in extensive numerical evaluations.
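To see why O(log log T) batches can suffice, consider a doubly exponential batch grid, a standard construction in the batched bandit literature. The sketch below assumes batch end points t_m = T^(1 - 2^(-m)); this summary does not state BLAE's actual schedule, so the function `batch_grid` and its parameters are illustrative only.

```python
import math

def batch_grid(T, M=None):
    """Return increasing batch end times, the last one equal to T.

    Assumed grid (not taken from the paper): t_m = ceil(T ** (1 - 2 ** -m)).
    With M about log2(log2(T)) + 1 batches, the second-to-last end point
    is already within a constant factor of T.
    """
    if T < 2:
        return [T]
    if M is None:
        M = max(1, math.ceil(math.log2(math.log2(T))) + 1)
    ends = []
    for m in range(1, M):
        t = min(T, math.ceil(T ** (1 - 2.0 ** (-m))))
        if not ends or t > ends[-1]:   # keep the grid strictly increasing
            ends.append(t)
    if not ends or ends[-1] < T:
        ends.append(T)                 # the final batch always ends at T
    return ends

if __name__ == "__main__":
    print(batch_grid(10**6))
```

For a horizon of one million rounds this grid uses at most six batches, and each batch is roughly the square root of T times the square root of the previous end point.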
Abstract
We study the linear bandit problem under limited adaptivity, known as the batched linear bandit. While existing approaches can achieve near-optimal regret in theory, they are often computationally prohibitive or underperform in practice. We propose BLAE, a novel batched algorithm that integrates arm elimination with regularized G-optimal design, achieving the minimax optimal regret (up to logarithmic factors in T) in both large-K and small-K regimes for the first time, while using only O(log log T) batches. Our analysis introduces new techniques for batch-wise optimal design and refined concentration bounds. Crucially, BLAE demonstrates low computational overhead and strong empirical performance, outperforming state-of-the-art methods in extensive numerical evaluations. Thus, BLAE is the first algorithm to combine provable minimax-optimality in all regimes and practical…
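As a rough illustration of what a regularized optimal design computes, the sketch below runs a greedy Fedorov-Wynn style ascent on log det(lam*I + sum_a pi_a a a^T) over a finite arm set and reports the largest leverage a^T V^{-1} a. This is a generic textbook procedure, not BLAE's design step; the function name `regularized_design`, the regularizer `lam`, the step size, and the iteration count are all assumptions made for this example.

```python
import numpy as np

def regularized_design(arms, lam=1e-3, n_iter=300):
    """Approximate a regularized optimal design over a finite arm set.

    arms: (K, d) array of feature vectors.
    Returns (pi, max_leverage), where pi is a probability vector over arms.
    """
    K, d = arms.shape
    pi = np.full(K, 1.0 / K)                  # start from the uniform design
    for k in range(n_iter):
        V = lam * np.eye(d) + arms.T @ (pi[:, None] * arms)
        # a^T V^{-1} a is the partial derivative of log det V w.r.t. pi_a
        lev = np.einsum("ij,ij->i", arms @ np.linalg.inv(V), arms)
        j = int(np.argmax(lev))               # arm with the largest uncertainty
        gamma = 1.0 / (k + K + 1)             # diminishing step size
        pi *= 1.0 - gamma                     # shift mass toward arm j
        pi[j] += gamma
    V = lam * np.eye(d) + arms.T @ (pi[:, None] * arms)
    lev = np.einsum("ij,ij->i", arms @ np.linalg.inv(V), arms)
    return pi, float(lev.max())

if __name__ == "__main__":
    pi, g = regularized_design(np.eye(3))
    print(pi, g)
```

In an elimination-style algorithm, a design like `pi` would typically decide how many pulls each surviving arm receives within a batch, so that the confidence width is controlled uniformly over the remaining arms.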