The Impact of Batch Learning in Stochastic Linear Bandits

Danil Provodin; Pratik Gajane; Mykola Pechenizkiy; Maurits Kaptein

arXiv:2202.06657·cs.LG·April 4, 2023

TL;DR

This paper analyzes how batch learning affects regret in stochastic linear bandits. It gives upper and lower regret bounds for settings with finitely and infinitely many arms, and uses empirical experiments to guide the choice of batch size.

Contribution

The paper offers a policy-agnostic regret analysis of batch learning in stochastic linear bandits, with upper and lower bounds for both the finite-arm and infinite-arm settings.

Findings

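01. Batch learning changes the regret by a multiplicative factor of the batch size, relative to the regret of online behavior.

02. The regret bounds are the same for finitely and infinitely many arms; the finite-arm results hold under milder assumptions.

03. Empirical results support the theoretical insights and inform the choice of batch size.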

Abstract

We consider a special case of bandit problems, named batched bandits, in which an agent observes batches of responses over a certain time period. Unlike previous work, we consider a more practically relevant batch-centric scenario of batch learning. That is to say, we provide a policy-agnostic regret analysis and demonstrate upper and lower bounds for the regret of a candidate policy. Our main theoretical results show that the impact of batch learning is a multiplicative factor of batch size relative to the regret of online behavior. Primarily, we study two settings of stochastic linear bandits: bandits with finitely many arms and bandits with infinitely many arms. While the regret bounds are the same for both settings, the results for the former hold under milder assumptions. Also, we provide a more robust result for the 2-armed bandit problem as an important insight. Finally, we demonstrate the…
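To make the batch setting concrete, here is a minimal simulation sketch, not the authors' code. It assumes a LinUCB-style policy on a synthetic finite-arm linear bandit, where the policy's statistics are refreshed only at the end of each batch. The function name `run_batched_linucb`, the parameters (`alpha`, `noise`, `dim`, `n_arms`), and the synthetic environment are all illustrative assumptions. Setting `batch_size=1` recovers ordinary online learning.

```python
import numpy as np


def run_batched_linucb(batch_size, horizon=2000, dim=3, n_arms=10,
                       alpha=1.0, noise=0.1, seed=0):
    """Cumulative regret of a LinUCB-style policy that only sees rewards
    at batch boundaries. Hypothetical illustration, not the paper's code."""
    rng = np.random.default_rng(seed)

    # Synthetic environment (assumption): unit-norm parameter and arms.
    theta = rng.normal(size=dim)
    theta /= np.linalg.norm(theta)
    arms = rng.normal(size=(n_arms, dim))
    arms /= np.linalg.norm(arms, axis=1, keepdims=True)
    means = arms @ theta
    best = means.max()

    # Statistics visible to the policy; frozen inside a batch.
    V = np.eye(dim)          # regularized design matrix
    b_vec = np.zeros(dim)    # sum of reward-weighted features

    # Observations collected during the current batch, not yet visible.
    V_buf = np.zeros((dim, dim))
    b_buf = np.zeros(dim)

    regret = 0.0
    for t in range(1, horizon + 1):
        V_inv = np.linalg.inv(V)
        theta_hat = V_inv @ b_vec
        # Exploration bonus: sqrt(x^T V^{-1} x) for every arm x.
        widths = np.sqrt(np.einsum("ij,jk,ik->i", arms, V_inv, arms))
        a = int(np.argmax(arms @ theta_hat + alpha * widths))

        x = arms[a]
        reward = means[a] + noise * rng.normal()
        V_buf += np.outer(x, x)
        b_buf += reward * x
        regret += best - means[a]

        # Reveal the buffered batch to the policy at the batch boundary.
        if t % batch_size == 0:
            V += V_buf
            b_vec += b_buf
            V_buf[:] = 0.0
            b_buf[:] = 0.0

    return float(regret)


if __name__ == "__main__":
    for b in (1, 10, 50):
        print(f"batch_size={b:>3}  regret={run_batched_linucb(b):.2f}")
```

Because the visible statistics do not change within a batch, this policy plays the same arm for the whole batch. That is one simple way to observe how delaying feedback affects regret as the batch size grows. Results depend on the seed and on the assumed environment, so they should be read as an illustration rather than as a reproduction of the paper's experiments.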

Code & Models

Repositories

danilprov/batch-bandits (Official)

Taxonomy

Topics: Advanced Bandit Algorithms Research · Data Stream Mining Techniques · Mobile Crowdsensing and Crowdsourcing