Sharp analysis of linear ensemble sampling
Arya Akhavan, David Janz, Csaba Szepesvári

TL;DR
This paper gives a sharp regret analysis of linear ensemble sampling in stochastic linear bandits, showing that it matches the Thompson sampling regret benchmark at comparable computational cost, and introduces a continuous-time perspective on randomized exploration.
Contribution
It offers the first sharp regret analysis of linear ensemble sampling using a continuous-time Brownian motion approach, bridging the gap to Thompson sampling.
Findings
Achieves $\tilde O(d^{3/2}\sqrt{n})$ regret with ensemble size $m=\Theta(d\log n)$
Introduces a new perspective by reducing analysis to a time-uniform exceedance problem for Brownian motions
Shows that continuous-time analysis is natural and perhaps necessary for sharp bounds in ensemble sampling
Abstract
We analyse linear ensemble sampling (ES) with standard Gaussian perturbations in stochastic linear bandits. We show that for ensemble size $m=\Theta(d\log n)$, ES attains $\tilde O(d^{3/2}\sqrt{n})$ high-probability regret, closing the gap to the Thompson sampling benchmark while keeping computation comparable. The proof brings a new perspective on randomized exploration in linear bandits by reducing the analysis to a time-uniform exceedance problem for independent Brownian motions. Intriguingly, this continuous-time lens is not forced; it appears natural, and perhaps necessary: the discrete-time problem seems to be asking for a continuous-time solution, and we know of no other way to obtain a sharp ES bound.
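To make the setting in the abstract concrete, the following is a minimal sketch of one common formulation of linear ensemble sampling with standard Gaussian perturbations. It is not taken from the paper. The function name `linear_ensemble_sampling`, the parameters `lam` and `noise_sd`, the finite arm set, the uniform choice of ensemble member each round, and the Gaussian perturbation of the regulariser are all illustrative assumptions, and the paper's exact algorithm may differ.

```python
import numpy as np


def linear_ensemble_sampling(arms, theta_star, n_rounds, m,
                             lam=1.0, noise_sd=1.0, seed=0):
    """Illustrative sketch (assumed formulation, not the paper's pseudocode).

    arms:       (K, d) array of arm feature vectors (finite arm set is an assumption)
    theta_star: (d,) true parameter, used only to simulate rewards and regret
    m:          ensemble size, e.g. on the order of d * log(n_rounds)
    Returns the cumulative pseudo-regret after each round.
    """
    rng = np.random.default_rng(seed)
    K, d = arms.shape

    # Regularised design matrix, shared by all ensemble members.
    V = lam * np.eye(d)
    # b[j] accumulates x_s * (y_s + z_{s,j}) for member j. Starting it from a
    # Gaussian draw (perturbing the regulariser) is an assumption of this sketch.
    b = np.sqrt(lam) * rng.standard_normal((m, d))

    best_mean = np.max(arms @ theta_star)
    cumulative = 0.0
    regret = np.zeros(n_rounds)

    for t in range(n_rounds):
        # Pick one ensemble member uniformly at random (assumed selection rule).
        j = rng.integers(m)
        theta_j = np.linalg.solve(V, b[j])      # member j's perturbed ridge estimate
        x = arms[np.argmax(arms @ theta_j)]     # act greedily for that member

        # Simulated noisy reward for the chosen arm.
        y = x @ theta_star + noise_sd * rng.standard_normal()

        # Every member sees the same reward plus its own standard Gaussian perturbation.
        z = rng.standard_normal(m)
        V += np.outer(x, x)
        b += np.outer(y + z, x)                 # shape (m, d)

        cumulative += best_mean - x @ theta_star
        regret[t] = cumulative

    return regret
```

Under these assumptions, each round costs one linear solve plus an update of `m` vectors, which is how a small ensemble can keep computation comparable to Thompson sampling. A call such as `linear_ensemble_sampling(arms, theta_star, n_rounds=n, m=int(np.ceil(d * np.log(n))))` mirrors the ensemble size $m=\Theta(d\log n)$ from the abstract, with the constant set to 1 purely for illustration.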
Taxonomy
Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques · Risk and Portfolio Optimization
