Why Adaptively Collected Data Have Negative Bias and How to Correct for It
Xinkun Nie, Xiaoying Tian, Jonathan Taylor, James Zou

TL;DR
This paper shows that sample means computed from adaptively collected data tend to be negatively biased, and it introduces a debiasing method that corrects for this and improves the accuracy of estimated effects.
Contribution
The paper proves the existence of negative bias in adaptively collected data and proposes a novel debiasing algorithm based on selective inference techniques.
Findings
Sample means are systematically negatively biased under adaptive data collection.
The proposed debiasing algorithm effectively reduces bias and estimation error.
The magnitude of the negative bias varies across adaptive settings.
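The first finding can be illustrated with a small simulation. The sketch below is not the paper's experiment or its debiasing algorithm: it assumes a hypothetical two-armed setting with unit-variance Gaussian rewards, a purely greedy collection rule (always sample the arm with the higher current sample mean), and illustrative parameters (`n_trials`, `horizon`, `seed`). It estimates the bias of each arm's sample mean by averaging over many repeated runs.

```python
import random


def simulate_greedy_bias(n_trials=20000, horizon=10, true_means=(0.0, 0.0), seed=0):
    """Estimate E[sample mean - true mean] per arm under greedy collection.

    All settings here are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    total_bias = [0.0] * n_arms
    for _ in range(n_trials):
        sums = [0.0] * n_arms
        counts = [0] * n_arms
        for t in range(horizon):
            if t < n_arms:
                # Pull every arm once so each sample mean is defined.
                arm = t
            else:
                # Greedy rule: sample the arm that currently looks best.
                arm = max(range(n_arms), key=lambda a: sums[a] / counts[a])
            reward = rng.gauss(true_means[arm], 1.0)
            sums[arm] += reward
            counts[arm] += 1
        for a in range(n_arms):
            total_bias[a] += sums[a] / counts[a] - true_means[a]
    # Average the per-run estimation error to approximate the bias.
    return [b / n_trials for b in total_bias]


bias = simulate_greedy_bias()
print(bias)
```

Under these assumptions, both entries of `bias` come out below zero. The intuition is that an arm whose early draws happen to be low stops being sampled, so its low sample mean is never corrected, while an arm whose early draws are high keeps being sampled and regresses toward its true mean.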
Abstract
From scientific experiments to online A/B testing, previously observed data often affect how future experiments are performed, which in turn affects which data will be collected. Such adaptivity introduces complex correlations between the data and the collection procedure. In this paper, we prove that when the data collection procedure satisfies natural conditions, sample means of the data have systematic negative biases. As an example, consider an adaptive clinical trial where additional data points are more likely to be tested for treatments that show initial promise. Our surprising result implies that the average observed treatment effects would underestimate the true effects of each treatment. We quantitatively analyze the magnitude and behavior of this negative bias in a variety of settings. We also propose a novel debiasing algorithm based on selective inference…
Taxonomy
Topics: Advanced Bandit Algorithms Research · Advanced Causal Inference Techniques · Statistical Methods in Clinical Trials
