Early Stopping in Contextual Bandits and Inferences

Zihan Cui (University of Michigan)

arXiv:2502.02793·math.ST·February 6, 2025

TL;DR

This paper introduces early stopping rules for linear contextual bandits to reduce sampling costs and improve decision-making, while enabling reliable post-experiment inferences based on online estimators.

Contribution

It develops new stopping rules based on Opportunity Cost and Threshold Methods, integrating variance-based regret bounds and asymptotic distributions for stable, adaptive decision processes.

Findings

01. Proposed stopping rules effectively minimize in-experiment regret.

02. Method enables robust online statistical inference after stopping.

03. Batched estimators improve stability and asymptotic analysis.

Abstract

Bandit algorithms sequentially accumulate data using adaptive sampling policies, offering flexibility for real-world applications. However, excessive sampling can be costly, motivating the development of early stopping methods and reliable post-experiment conditional inferences. This paper studies early stopping methods in linear contextual bandits, including both pre-determined and online stopping rules, to minimize in-experiment regret while accounting for sampling costs. We propose stopping rules based on the Opportunity Cost and Threshold Method, utilizing the variances of unbiased or consistent online estimators to quantify the upper regret bounds of the learned optimal policy. The study focuses on batched settings for stability, selecting a weighted combination of batched estimators as the online estimator and deriving its asymptotic distribution. Online statistical inferences are…
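
To make the idea of a variance-driven stopping rule concrete, here is a minimal sketch in Python. It is not the paper's method: the abstract does not give the weights or the exact stopping criteria, so the sketch assumes inverse-variance weights, independent scalar batch estimates, and a simple rule that stops once the variance of the combined estimator falls below a user-chosen threshold. The names `combine_batches` and `threshold_stop` are hypothetical.

```python
from typing import List, Optional, Tuple


def combine_batches(estimates: List[float], variances: List[float]) -> Tuple[float, float]:
    """Combine per-batch estimates with inverse-variance weights.

    Assumption (illustrative only): batch estimates are independent and each
    comes with a known or estimated variance. Returns the combined estimate
    and the variance of that combination.
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    combined = sum(w * e for w, e in zip(weights, estimates)) / total
    return combined, 1.0 / total


def threshold_stop(estimates: List[float], variances: List[float],
                   threshold: float) -> Optional[int]:
    """Return the first batch count (1-based) at which the variance of the
    combined estimator is at most `threshold`, or None if it never is."""
    for k in range(1, len(estimates) + 1):
        _, var = combine_batches(estimates[:k], variances[:k])
        if var <= threshold:
            return k
    return None


if __name__ == "__main__":
    batch_estimates = [1.0, 3.0, 2.0]   # made-up numbers for illustration
    batch_variances = [1.0, 1.0, 1.0]
    print(combine_batches(batch_estimates, batch_variances))
    print(threshold_stop(batch_estimates, batch_variances, threshold=0.4))
```

With equal batch variances the combined variance shrinks as 1/k, so a tighter threshold simply requires more batches before stopping; a cost-aware rule would additionally trade that shrinkage against the per-batch sampling cost.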

Taxonomy

Topics: Decision-Making and Behavioral Economics

Methods: Early Stopping