Efficient Inference after Directionally Stable Adaptive Experiments
Zikai Shen, Houssam Zenati, Nathan Kallus, Arthur Gretton, Koulik Khamaru, Aurélien Bibaut

TL;DR
This paper introduces directional stability, a condition on adaptive experiments under which estimators computed from adaptively collected data admit valid inference and efficiency guarantees. LinUCB serves as the worked example.
Contribution
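It proposes directional stability, a target-specific condition that is strictly weaker than the target-agnostic stability conditions imposed in earlier work, and shows that it ensures asymptotic normality and efficiency of estimators after adaptive data collection.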
Findings
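Under directional stability, estimators that would be efficient with i.i.d. data remain asymptotically normal and semiparametrically efficient under adaptive sampling.
The canonical gradient has a martingale form, and directional stability guarantees that its predictable quadratic variation stabilizes.
LinUCB satisfies directional stability, so these efficiency guarantees apply to data it collects.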
Abstract
We study inference on scalar-valued pathwise differentiable targets after adaptive data collection, such as a bandit algorithm. We introduce a novel target-specific condition, directional stability, which is strictly weaker than previously imposed target-agnostic stability conditions. Under directional stability, we show that estimators that would have been efficient under i.i.d. data remain asymptotically normal and semiparametrically efficient when computed from adaptively collected trajectories. The canonical gradient has a martingale form, and directional stability guarantees stabilization of its predictable quadratic variation, enabling high-dimensional asymptotic normality. We characterize efficiency using a convolution theorem for the adaptive-data setting, and give a condition under which the one-step estimator attains the efficiency bound. We verify directional stability for…
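To make the setting concrete, the sketch below simulates an adaptive experiment and then computes a standard Wald interval for one scalar target from the collected data. It is an illustration only, not the paper's estimator or analysis. The function names, the arm means, the noise level, and the parameters `alpha` and `lam` are all hypothetical choices. The bandit is LinUCB specialized to one-hot arm features, where the ridge estimate and exploration bonus reduce to per-arm quantities. Whether such an interval is asymptotically valid is exactly the kind of question a stability condition is meant to answer; the code does not check any such condition.

```python
import math
import random


def run_one_hot_linucb(means, horizon, alpha=2.0, lam=1.0, noise_sd=1.0, seed=0):
    """Run LinUCB with one-hot arm features and return per-arm reward lists.

    With one-hot features the ridge design matrix is diagonal, so the
    estimate for an arm is sum / (lam + n) and the bonus is alpha / sqrt(lam + n).
    All parameter values here are illustrative assumptions.
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k
    sums = [0.0] * k
    rewards = [[] for _ in range(k)]
    for _ in range(horizon):
        # Upper confidence score for each arm, based only on past data.
        scores = [
            sums[a] / (lam + counts[a]) + alpha / math.sqrt(lam + counts[a])
            for a in range(k)
        ]
        arm = max(range(k), key=lambda a: scores[a])
        reward = means[arm] + rng.gauss(0.0, noise_sd)
        counts[arm] += 1
        sums[arm] += reward
        rewards[arm].append(reward)
    return rewards


def wald_interval(samples, z=1.96):
    """Plain sample-mean Wald interval, as one would use for i.i.d. data."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / (n - 1)
    half_width = z * math.sqrt(var / n)
    return mean - half_width, mean + half_width


# Target: the mean reward of arm 0 (a single direction of the parameter).
rewards = run_one_hot_linucb(means=[0.5, 0.3], horizon=2000)
low, high = wald_interval(rewards[0])
print(f"arm 0 pulls: {len(rewards[0])}, interval: ({low:.3f}, {high:.3f})")
```

Repeating the simulation over many seeds and recording how often the interval covers the true mean of each arm gives an informal check of whether i.i.d.-style inference behaves well for that particular target.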
Taxonomy
Topics: Statistical Methods and Inference · Stochastic Gradient Optimization Techniques · Advanced Bandit Algorithms Research
