Avoiding the Price of Adaptivity: Inference in Linear Contextual Bandits via Stability
Samya Praharaj, Koulik Khamaru

TL;DR
This paper introduces a stability-based approach for inference in linear contextual bandits, enabling valid confidence intervals without the usual adaptivity penalty, and demonstrates its effectiveness both theoretically and empirically.
Contribution
The paper proposes a regularized EXP4 algorithm that satisfies the Lai–Wei stability condition, which permits valid inference while retaining near-optimal regret in adaptive linear bandit settings.
Findings
The proposed algorithm satisfies the Lai–Wei stability condition.
It provides asymptotically valid Wald-type confidence intervals.
It achieves near-minimax optimal regret up to logarithmic factors.
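For readers who want the first two findings in symbols, the following is one common way to write a Lai–Wei-type stability condition and the resulting Wald statistic. The notation is an assumption introduced here for illustration, not taken from this page: x_t is the chosen feature vector, S_T is the unnormalized feature covariance, \hat\theta_T is the least-squares estimate, and \hat\sigma is a noise-scale estimate. The paper's exact normalization and noise assumptions may differ.

```latex
% Stability: the feature covariance, suitably normalized, has a deterministic limit.
S_T = \sum_{t=1}^{T} x_t x_t^\top,
\qquad
B_T^{-1} S_T \xrightarrow{\;p\;} I_d
\quad \text{for some deterministic positive-definite sequence } \{B_T\}.

% Consequence: the studentized least-squares estimate of a^\top \theta^\star is asymptotically normal,
% so the usual Wald interval has the nominal coverage in the limit.
\frac{a^\top \hat\theta_T - a^\top \theta^\star}{\hat\sigma \sqrt{a^\top S_T^{-1} a}}
\xrightarrow{\;d\;} \mathcal{N}(0, 1).
```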
Abstract
Statistical inference in contextual bandits is challenging due to the adaptive, non-i.i.d. nature of the data. A growing body of work shows that classical least-squares inference can fail under adaptive sampling, and that valid confidence intervals for linear functionals typically require an inflation of order . This phenomenon -- often termed the price of adaptivity -- reflects the intrinsic difficulty of reliable inference under general contextual bandit policies. A key structural condition that overcomes this limitation is the stability condition of Lai and Wei, which requires the empirical feature covariance to converge to a deterministic limit. When stability holds, the ordinary least-squares estimator satisfies a central limit theorem, and classical Wald-type confidence intervals remain asymptotically valid under adaptation, without incurring the …
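As a rough illustration of the classical Wald-type interval the abstract refers to, the sketch below computes an ordinary least-squares estimate and a confidence interval for a linear functional of the parameter. It is not the paper's algorithm: the function name `wald_interval` and all variable names are hypothetical, and the demo data is i.i.d. synthetic rather than collected by a bandit policy. With adaptively collected data, the interval is only justified when a stability condition holds, which this sketch does not check.

```python
import numpy as np
from statistics import NormalDist


def wald_interval(X, y, a, alpha=0.05):
    """Wald-type confidence interval for the linear functional a^T theta.

    X: (n, d) feature matrix, y: (n,) rewards, a: (d,) direction of interest.
    Hypothetical helper for illustration; assumes X^T X is invertible.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    a = np.asarray(a, dtype=float)
    n, d = X.shape

    gram = X.T @ X                               # unnormalized feature covariance
    theta_hat = np.linalg.solve(gram, X.T @ y)   # ordinary least-squares estimate

    resid = y - X @ theta_hat
    sigma2_hat = resid @ resid / (n - d)         # residual-based noise variance estimate

    # Standard error of a^T theta_hat: sigma * sqrt(a^T (X^T X)^{-1} a)
    se = np.sqrt(sigma2_hat * (a @ np.linalg.solve(gram, a)))
    z = NormalDist().inv_cdf(1 - alpha / 2)      # standard normal quantile

    center = a @ theta_hat
    return center - z * se, center + z * se


# Demo on i.i.d. synthetic data (an assumption; not bandit data).
rng = np.random.default_rng(0)
theta = np.array([1.0, -0.5, 0.25])
X = rng.normal(size=(2000, 3))
y = X @ theta + rng.normal(size=2000)
a = np.array([1.0, 0.0, 0.0])                    # target: first coordinate of theta

lo, hi = wald_interval(X, y, a)
print(f"95% Wald interval for a^T theta: [{lo:.3f}, {hi:.3f}]")
```

The interval width depends on the data only through the feature covariance and the residual variance, which is why convergence of that covariance to a deterministic limit is the natural condition for the usual normal approximation to apply.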
Taxonomy
Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms
