Efficient Inference Without Trading-off Regret in Bandits: An Allocation Probability Test for Thompson Sampling

Nina Deliu; Joseph J. Williams; Sofia S. Villar

arXiv:2111.00137 · stat.ML · November 2, 2021 · 1 cites

Open Access

TL;DR

This paper introduces a new hypothesis test based on allocation probabilities for Thompson Sampling bandits, enabling reliable inference without sacrificing regret minimization or requiring large sample sizes, especially beneficial for small experiments.

Contribution

The paper presents the Allocation Probability Test, a novel inference method for bandit algorithms that does not restrict exploitative behavior or need large samples, with proven theoretical and practical advantages.

Findings

1. The test maintains valid inference in small samples.
2. It outperforms existing methods in finite-sample scenarios.
3. It demonstrates reduced regret and improved statistical power.

Abstract

Using bandit algorithms to conduct adaptive randomised experiments can minimise regret, but it poses major challenges for statistical inference (e.g., biased estimators, inflated type-I error and reduced power). Recent attempts to address these challenges typically impose restrictions on the exploitative nature of the bandit algorithm (trading off regret) and require large sample sizes to ensure asymptotic guarantees. However, large experiments generally follow a successful pilot study, which is tightly constrained in its size or duration. Increasing power in such small pilot experiments, without limiting the adaptive nature of the algorithm, can allow promising interventions to reach a larger experimental phase. In this work we introduce a novel hypothesis test, uniquely based on the allocation probabilities of the bandit algorithm, and without constraining its exploitative nature or…
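
The abstract is cut off before the test itself is described, so the following is only a rough sketch of the quantity it refers to, not the authors' test. Under Thompson Sampling, the allocation probability of an arm is the posterior probability that the arm has the highest mean reward. The sketch assumes Bernoulli rewards with independent Beta(1, 1) priors and estimates these probabilities by Monte Carlo; the function name `allocation_probabilities` and its parameters are hypothetical, chosen for illustration.

```python
import random


def allocation_probabilities(successes, failures, n_draws=20000, seed=0):
    """Monte Carlo estimate of Thompson Sampling allocation probabilities.

    Assumption of this sketch (not taken from the paper): Bernoulli rewards
    with independent Beta(1, 1) priors, so the posterior of arm k is
    Beta(1 + successes[k], 1 + failures[k]).
    """
    rng = random.Random(seed)
    n_arms = len(successes)
    wins = [0] * n_arms
    for _ in range(n_draws):
        # Draw one sample from each arm's posterior.
        draws = [
            rng.betavariate(1 + s, 1 + f)
            for s, f in zip(successes, failures)
        ]
        # The arm with the largest draw is the one Thompson Sampling would play.
        best = max(range(n_arms), key=draws.__getitem__)
        wins[best] += 1
    # The fraction of draws in which each arm wins estimates its allocation probability.
    return [w / n_draws for w in wins]


# Hypothetical pilot data: arm 0 has 12 successes and 3 failures,
# arm 1 has 5 successes and 10 failures.
probs = allocation_probabilities([12, 5], [3, 10])
print(probs)
```

With balanced data the estimates sit near 0.5 for each arm, and they move towards 0 or 1 as the evidence favours one arm. A test built on allocation probabilities would use how far they drift from the balanced value; the exact decision rule is not given in the text above.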

Taxonomy

Topics: Advanced Bandit Algorithms Research · Data Stream Mining Techniques · Mobile Crowdsensing and Crowdsourcing