Efficient Inference Without Trading-off Regret in Bandits: An Allocation Probability Test for Thompson Sampling
Nina Deliu, Joseph J. Williams, Sofia S. Villar

TL;DR
This paper introduces a new hypothesis test based on the allocation probabilities of Thompson Sampling bandits. The test enables reliable inference without sacrificing regret minimization or requiring large sample sizes, which is especially useful in small experiments.
Contribution
The paper presents the Allocation Probability Test, a novel inference method for bandit algorithms that does not restrict exploitative behavior or need large samples, with proven theoretical and practical advantages.
Findings
The test maintains valid inference in small samples.
It outperforms existing methods in finite-sample scenarios.
It demonstrates reduced regret and improved statistical power.
Abstract
Using bandit algorithms to conduct adaptive randomised experiments can minimise regret, but it poses major challenges for statistical inference (e.g., biased estimators, inflated type-I error and reduced power). Recent attempts to address these challenges typically impose restrictions on the exploitative nature of the bandit algorithm, trading off regret, and require large sample sizes to ensure asymptotic guarantees. However, large experiments generally follow a successful pilot study, which is tightly constrained in its size or duration. Increasing power in such small pilot experiments, without limiting the adaptive nature of the algorithm, can allow promising interventions to reach a larger experimental phase. In this work we introduce a novel hypothesis test, uniquely based on the allocation probabilities of the bandit algorithm, and without constraining its exploitative nature or…
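To make the term "allocation probability" concrete, the following is a minimal sketch, not the authors' test. It assumes a two-arm Bernoulli bandit with Beta(1, 1) priors, neither of which is specified in the abstract, and the function names `allocation_probability` and `run_thompson` are hypothetical. It only shows the quantity the test is said to be built on: the probability with which Thompson Sampling would assign the next participant to a given arm.

```python
import random


def allocation_probability(successes, failures, n_draws=2000, rng=random):
    """Monte Carlo estimate of P(theta_1 > theta_0 | data).

    Assumes Bernoulli rewards and independent Beta(1, 1) priors
    (an illustrative choice, not taken from the paper).
    """
    wins = 0
    for _ in range(n_draws):
        theta0 = rng.betavariate(1 + successes[0], 1 + failures[0])
        theta1 = rng.betavariate(1 + successes[1], 1 + failures[1])
        if theta1 > theta0:
            wins += 1
    return wins / n_draws


def run_thompson(true_means, horizon, seed=0):
    """Run two-arm Thompson Sampling and record arm 1's allocation
    probability before each assignment."""
    rng = random.Random(seed)
    successes = [0, 0]
    failures = [0, 0]
    history = []
    for _ in range(horizon):
        # Record the current allocation probability of arm 1.
        history.append(allocation_probability(successes, failures, rng=rng))
        # Thompson Sampling step: one posterior draw per arm, play the larger.
        draws = [rng.betavariate(1 + successes[a], 1 + failures[a]) for a in (0, 1)]
        arm = 1 if draws[1] > draws[0] else 0
        # Simulated Bernoulli reward from the hypothetical true mean.
        if rng.random() < true_means[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return history, successes, failures


if __name__ == "__main__":
    history, successes, failures = run_thompson([0.3, 0.7], horizon=50, seed=1)
    print("final allocation probability of arm 1:", history[-1])
```

With no data, the two posteriors are identical and the allocation probability starts near 0.5; as evidence accumulates it drifts towards the arm that looks better. How the paper turns these probabilities into a test statistic and a rejection rule is not stated in the truncated abstract, so that step is deliberately left out here.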
Taxonomy
Topics: Advanced Bandit Algorithms Research · Data Stream Mining Techniques · Mobile Crowdsensing and Crowdsourcing
