Stable Thompson Sampling: Valid Inference via Variance Inflation
Budhaditya Halder, Shubhayan Pan, Koulik Khamaru

TL;DR
This paper introduces Stable Thompson Sampling, a variant of Thompson Sampling that inflates the posterior variance by a logarithmic factor. The modification enables valid confidence intervals under adaptive sampling while increasing regret by only a logarithmic factor.
Contribution
The paper proposes a variance-inflated version of Thompson Sampling whose estimates of the arm means are asymptotically normal, which permits valid statistical inference when data is collected adaptively.
Findings
Asymptotic normality of estimates achieved with variance inflation
Regret increases by only a logarithmic factor relative to standard TS
Provides a principled trade-off between inference validity and regret
Abstract
We consider the problem of statistical inference when the data is collected via a Thompson Sampling-type algorithm. While Thompson Sampling (TS) is known to be both asymptotically optimal and empirically effective, its adaptive sampling scheme poses challenges for constructing confidence intervals for model parameters. We propose and analyze a variant of TS, called Stable Thompson Sampling, in which the posterior variance is inflated by a logarithmic factor. We show that this modification leads to asymptotically normal estimates of the arm means, despite the non-i.i.d. nature of the data. Importantly, this statistical benefit comes at a modest cost: the variance inflation increases regret by only a logarithmic factor compared to standard TS. Our results reveal a principled trade-off: by paying a small price in regret, one can enable valid statistical inference for adaptive…
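The idea in the abstract can be illustrated with a minimal sketch; it is not the authors' implementation. It assumes unit-variance Gaussian rewards, a flat prior (so the posterior for each arm is normal with mean equal to the sample mean and variance 1/n), and an inflation factor of log(horizon). The abstract states only that the inflation is logarithmic, so the exact factor here is an assumption, as are the function names `stable_thompson_sampling` and `wald_interval`. The interval is the standard normal-approximation (Wald) interval, which is what asymptotic normality of the arm-mean estimates would justify.

```python
import numpy as np


def stable_thompson_sampling(means, horizon, inflate=True, seed=0):
    """Gaussian Thompson Sampling with optional posterior variance inflation.

    Assumptions (not taken from the paper): unit-variance Gaussian rewards,
    a flat prior, and an inflation factor of log(horizon).
    Returns the pull counts and the sample mean of each arm.
    """
    rng = np.random.default_rng(seed)
    means = np.asarray(means, dtype=float)
    k = means.size
    counts = np.zeros(k, dtype=int)
    sums = np.zeros(k)
    # Hypothetical choice of logarithmic inflation; 1.0 recovers standard TS.
    factor = max(np.log(horizon), 1.0) if inflate else 1.0

    for t in range(horizon):
        if t < k:
            arm = t  # pull every arm once so each posterior is defined
        else:
            post_mean = sums / counts
            post_var = factor / counts  # inflated posterior variance
            draws = rng.normal(post_mean, np.sqrt(post_var))
            arm = int(np.argmax(draws))
        reward = rng.normal(means[arm], 1.0)
        counts[arm] += 1
        sums[arm] += reward

    return counts, sums / counts


def wald_interval(estimate, count, z=1.96):
    """Normal-approximation interval for an arm mean with known unit variance."""
    half_width = z / np.sqrt(count)
    return estimate - half_width, estimate + half_width


if __name__ == "__main__":
    counts, estimates = stable_thompson_sampling([0.0, 0.5], horizon=2000)
    for arm, (n, est) in enumerate(zip(counts, estimates)):
        lo, hi = wald_interval(est, n)
        print(f"arm {arm}: pulls={n}, estimate={est:.3f}, 95% CI=({lo:.3f}, {hi:.3f})")
```

Setting `inflate=False` gives standard Thompson Sampling under the same assumptions, which makes it easy to compare how often the suboptimal arm is pulled in each case.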
Taxonomy
Topics: Advanced Bandit Algorithms Research · Statistical Methods and Bayesian Inference · SARS-CoV-2 detection and testing
Methods: Spatio-temporal stability analysis
