Valid Post-Contextual Bandit Inference
Ramon van den Akker, Bas J.M. Werker, and Bo Zhou

TL;DR
This paper develops an asymptotic framework for statistical inference in the contextual multi-armed bandit setting, providing new theoretical tools and distributions for common tests under adaptive sampling schemes.
Contribution
The paper introduces an asymptotic analysis of the contextual multi-armed bandit (CMAB) problem based on limit experiments and stochastic differential equations, enabling valid inference with adaptively collected data.
Findings
Derives asymptotic distributions for the t-test, weighted tests, and inverse-propensity-weighted (IPW) tests under adaptive sampling.
Identifies conditions for the validity of these tests under translation-invariant sampling schemes.
Proposes translation-invariant variants of popular bandit algorithms.
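To make the kind of data and statistic discussed above concrete, here is a minimal sketch of an inverse-propensity-weighted mean computed from adaptively collected bandit data. It is not the authors' procedure. It assumes a context-free two-armed Gaussian bandit sampled by epsilon-greedy, and the names `run_epsilon_greedy`, `sample_mean`, and `ipw_mean` are hypothetical helpers introduced only for illustration.

```python
import math
import random


def run_epsilon_greedy(horizon, means, epsilon=0.2, seed=0):
    """Simulate a two-armed Gaussian bandit (assumed setup, unit variance).

    Returns one (arm, reward, propensity) tuple per round, where propensity
    is the probability with which the played arm was selected.
    """
    rng = random.Random(seed)
    counts = [0, 0]
    sums = [0.0, 0.0]
    history = []
    for _ in range(horizon):
        # Running mean per arm; an arm not yet played is scored as 0.0.
        est = [sums[a] / counts[a] if counts[a] else 0.0 for a in (0, 1)]
        greedy = 0 if est[0] >= est[1] else 1
        # Epsilon-greedy selection probabilities, known to the experimenter.
        probs = [epsilon / 2.0, epsilon / 2.0]
        probs[greedy] += 1.0 - epsilon
        arm = 0 if rng.random() < probs[0] else 1
        reward = rng.gauss(means[arm], 1.0)
        counts[arm] += 1
        sums[arm] += reward
        history.append((arm, reward, probs[arm]))
    return history


def sample_mean(history, arm):
    """Plain average of the rewards observed on the given arm."""
    rewards = [r for a, r, _ in history if a == arm]
    return sum(rewards) / len(rewards) if rewards else math.nan


def ipw_mean(history, arm):
    """Inverse-propensity-weighted estimate of the given arm's mean reward."""
    total = sum(r / p for a, r, p in history if a == arm)
    return total / len(history)


if __name__ == "__main__":
    data = run_epsilon_greedy(horizon=2000, means=(0.0, 0.0), seed=1)
    print("sample mean, arm 0:", sample_mean(data, 0))
    print("IPW mean, arm 0:   ", ipw_mean(data, 0))
```

The sketch only produces point estimates. It does not reproduce the limiting distributions derived in the paper, which are what determine whether a test based on such statistics is valid under adaptive sampling.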
Abstract
We establish an asymptotic framework for the statistical analysis of the stochastic contextual multi-armed bandit problem (CMAB), which is widely employed in adaptively randomized experiments across various fields. While algorithms for maximizing rewards or, equivalently, minimizing regret have received considerable attention, our focus centers on statistical inference with adaptively collected data under the CMAB model. To this end, we derive the limit experiment (in the Hájek-Le Cam sense). This limit experiment is highly nonstandard and, applying Girsanov's theorem, we obtain a structural representation in terms of stochastic differential equations. This structural representation, and a general weak convergence result we develop, allow us to obtain the asymptotic distribution of statistics for the CMAB problem. In particular, we obtain the asymptotic distributions for the classical…
Taxonomy
Topics: Decision-Making and Behavioral Economics
Methods: Focus · Class-activation map
