Statistical Inference on Multi-armed Bandits with Delayed Feedback
Lei Shi, Jingshen Wang, Tianhao Wu

TL;DR
This paper develops a statistical inference framework for multi-armed bandit policies with delayed feedback, enabling valid uncertainty quantification and policy evaluation in complex, real-world scenarios.
Contribution
The paper introduces an adaptively weighted estimator that accounts for arm-dependent delays without estimating the delay mechanism, and proves that the estimator is asymptotically normal.
Findings
The estimator is consistent when feedback delays depend on the treatment arm.
Monte Carlo simulations validate its finite-sample performance.
Asymptotic normality supports large-sample confidence intervals and tests.
Abstract
Multi-armed bandit (MAB) algorithms have been increasingly used to complement or integrate with A/B tests and randomized clinical trials in e-commerce, healthcare, and policymaking. Recent developments incorporate possible delayed feedback. While existing MAB literature often focuses on maximizing the expected cumulative reward outcomes (or, equivalently, regret minimization), few efforts have been devoted to establishing valid statistical inference approaches to quantify the uncertainty of learned policies. We attempt to fill this gap by providing a unified statistical inference framework for policy evaluation where a target policy is allowed to differ from the data-collecting policy, and our framework allows delay to be associated with the treatment arms. We present an adaptively weighted estimator that, on the one hand, incorporates the arm-dependent delaying mechanism to achieve consistency,…
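To make the setting in the abstract concrete, the following is a minimal simulation sketch, not the paper's estimator. It assumes a two-armed Gaussian bandit run with epsilon-greedy, known action propensities, and delays that depend on the arm but not on the reward. As a stand-in for the adaptively weighted estimator, it uses a plain self-normalized inverse-propensity (Hájek) average over the rewards that have arrived. All function names, parameters, and default values here are hypothetical.

```python
import heapq
import random


def simulate_delayed_bandit(T=20000, means=(0.3, 0.5), max_delay=(5, 50),
                            eps=0.2, seed=0):
    """Run epsilon-greedy on a Gaussian bandit with arm-dependent delays.

    Returns (arm, propensity, reward) triples for the rewards that arrived
    before the final round; later arrivals are treated as censored.
    """
    rng = random.Random(seed)
    k = len(means)
    sums = [0.0] * k    # running sums over rewards that have arrived
    counts = [0] * k    # number of arrived rewards per arm
    pending = []        # min-heap of (arrival_round, arm, propensity, reward)
    observed = []

    for t in range(T):
        # Reveal every reward whose delay has elapsed before acting.
        while pending and pending[0][0] <= t:
            _, a, e, y = heapq.heappop(pending)
            sums[a] += y
            counts[a] += 1
            observed.append((a, e, y))

        # Greedy arm uses arrived rewards only; unseen arms are tried first.
        est = [sums[a] / counts[a] if counts[a] else float("inf")
               for a in range(k)]
        greedy = max(range(k), key=lambda a: est[a])
        probs = [eps / k + (1.0 - eps) * (a == greedy) for a in range(k)]

        arm = rng.choices(range(k), weights=probs)[0]
        reward = rng.gauss(means[arm], 1.0)
        # Assumption: the delay depends on the arm only, not on the reward.
        delay = rng.randint(0, max_delay[arm])
        heapq.heappush(pending, (t + 1 + delay, arm, probs[arm], reward))

    return observed


def hajek_arm_mean(observed, arm):
    """Self-normalized inverse-propensity estimate of one arm's mean reward."""
    num = sum(y / e for a, e, y in observed if a == arm)
    den = sum(1.0 / e for a, e, y in observed if a == arm)
    return num / den


if __name__ == "__main__":
    data = simulate_delayed_bandit()
    for arm in (0, 1):
        print(f"arm {arm}: estimated mean = {hajek_arm_mean(data, arm):.3f}")
```

Because the weights are self-normalized, the unknown probability that a reward has arrived cancels between numerator and denominator under the stated independence assumption, so the sketch never models the delay distribution. It does not reproduce the variance-stabilizing weighting or the asymptotic normality results claimed in the paper.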
Taxonomy
Topics: Advanced Bandit Algorithms Research · Healthcare Operations and Scheduling Optimization · Advanced Causal Inference Techniques
