The Impact of Batch Learning in Stochastic Linear Bandits
Danil Provodin, Pratik Gajane, Mykola Pechenizkiy, Maurits Kaptein

TL;DR
This paper analyzes how batch learning affects regret in stochastic linear bandits. It gives upper and lower regret bounds for settings with finitely and infinitely many arms, and uses experiments to guide the choice of batch size.
Contribution
The paper offers a policy-agnostic regret analysis of batch learning in stochastic linear bandits, with upper and lower bounds for both the finite-arm and infinite-arm settings.
Findings
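Batch learning changes the regret by a multiplicative factor of the batch size, relative to the regret of the same policy run online.
The regret bounds are the same for finitely and infinitely many arms, but the finite-arm results hold under milder assumptions.
Empirical results support the theoretical insights and inform the choice of batch size.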
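To make the first finding concrete, here is a minimal simulation sketch. It is not the authors' code or experimental setup. It assumes a simplified 2-armed Bernoulli bandit rather than a general linear bandit, uses UCB1 as a stand-in candidate policy, and models batching by revealing rewards to the policy only at batch boundaries. The function name `batched_ucb_regret` and all of its parameters are hypothetical.

```python
import math
import random


def batched_ucb_regret(means, horizon, batch_size, seed=0):
    """Pseudo-regret of UCB1 when feedback is revealed only at batch ends.

    Illustrative assumption: Bernoulli arms with the given means.
    batch_size=1 recovers the ordinary online policy.
    """
    rng = random.Random(seed)
    n_arms = len(means)
    best_mean = max(means)

    # Statistics the policy may use; frozen while a batch is in progress.
    seen_counts = [0] * n_arms
    seen_sums = [0.0] * n_arms
    # Feedback gathered in the current batch, not yet shown to the policy.
    pending = []
    regret = 0.0

    for t in range(1, horizon + 1):
        if 0 in seen_counts:
            # An arm with no revealed feedback is tried first.
            arm = seen_counts.index(0)
        else:
            # UCB1 index computed from revealed statistics only.
            arm = max(
                range(n_arms),
                key=lambda a: seen_sums[a] / seen_counts[a]
                + math.sqrt(2.0 * math.log(t) / seen_counts[a]),
            )

        reward = 1.0 if rng.random() < means[arm] else 0.0
        pending.append((arm, reward))
        regret += best_mean - means[arm]

        # Reveal the batch at its boundary (or at the end of the horizon).
        if t % batch_size == 0 or t == horizon:
            for a, r in pending:
                seen_counts[a] += 1
                seen_sums[a] += r
            pending.clear()

    return regret


if __name__ == "__main__":
    # Compare the online policy (batch size 1) with larger batches.
    for b in (1, 10, 100):
        print(b, batched_ucb_regret([0.4, 0.6], horizon=5000, batch_size=b))
```

Because the policy sees no new feedback inside a batch, it repeats the same choice for the whole batch, so an early mistake costs up to a full batch of suboptimal pulls. This is one intuitive reading of why batch size could enter the regret as a multiplicative factor. The sketch does not reproduce the paper's bounds.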
Abstract
We consider a special case of bandit problems, named batched bandits, in which an agent observes batches of responses over a certain time period. Unlike previous work, we consider a more practically relevant batch-centric scenario of batch learning. That is to say, we provide a policy-agnostic regret analysis and demonstrate upper and lower bounds for the regret of a candidate policy. Our main theoretical results show that the impact of batch learning is a multiplicative factor of batch size relative to the regret of online behavior. Primarily, we study two settings of the stochastic linear bandits: bandits with finitely and infinitely many arms. While the regret bounds are the same for both settings, the former setting results hold under milder assumptions. Also, we provide a more robust result for the 2-armed bandit problem as an important insight. Finally, we demonstrate the…
Taxonomy
Topics: Advanced Bandit Algorithms Research · Data Stream Mining Techniques · Mobile Crowdsensing and Crowdsourcing
