Avoiding the Price of Adaptivity: Inference in Linear Contextual Bandits via Stability

Samya Praharaj; Koulik Khamaru

arXiv:2512.20368 · stat.ML · January 9, 2026

Open Access

TL;DR

This paper introduces a stability-based approach for inference in linear contextual bandits, enabling valid confidence intervals without the usual adaptivity penalty, and demonstrates its effectiveness both theoretically and empirically.

Contribution

It proposes a regularized EXP4 algorithm that satisfies the stability condition, allowing for valid inference and near-optimal regret in adaptive linear bandit settings.

Findings

1. The proposed algorithm satisfies the Lai–Wei stability condition.

2. It provides asymptotically valid Wald-type confidence intervals.

3. It achieves near-minimax optimal regret up to logarithmic factors.
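
For orientation, the stability condition and the Wald-type interval named in these findings can be written as in the sketch below. The notation ($x_t$, $S_T$, $\Sigma^\star$, $\hat\theta_T$, $\hat\sigma$, $a$) is assumed here for illustration and is not taken from the paper. The paper may state stability in a more general form, for example with a deterministic normalizing sequence in place of $1/T$.

```latex
% Assumed notation: x_t is the feature vector of the arm chosen at round t,
% \hat\theta_T is the ordinary least-squares estimate after T rounds.
S_T = \sum_{t=1}^{T} x_t x_t^\top, \qquad
\frac{1}{T} S_T \xrightarrow{\;p\;} \Sigma^\star
\quad \text{(deterministic, positive definite)}

% Wald-type interval for the linear functional a^\top \theta at level 1 - \alpha,
% with \hat\sigma^2 an estimate of the noise variance:
a^\top \hat\theta_T \;\pm\; z_{1-\alpha/2}\, \hat\sigma \sqrt{a^\top S_T^{-1} a}
```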

Abstract

Statistical inference in contextual bandits is challenging due to the adaptive, non-i.i.d. nature of the data. A growing body of work shows that classical least-squares inference can fail under adaptive sampling, and that valid confidence intervals for linear functionals typically require an inflation of order $d \log T$. This phenomenon -- often termed the price of adaptivity -- reflects the intrinsic difficulty of reliable inference under general contextual bandit policies. A key structural condition that overcomes this limitation is the stability condition of Lai and Wei, which requires the empirical feature covariance to converge to a deterministic limit. When stability holds, the ordinary least-squares estimator satisfies a central limit theorem, and classical Wald-type confidence intervals remain asymptotically valid under adaptation, without incurring the $d \log T$ …
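
As a rough illustration of the classical Wald-type interval the abstract refers to, the sketch below computes an ordinary least-squares estimate and a normal-approximation interval for a linear functional. The function name `wald_interval`, its arguments, and the synthetic data are hypothetical choices made for this example, not the paper's code. The demo data are drawn i.i.d. purely to exercise the function. According to the abstract, the same formula stays asymptotically valid for adaptively collected data only when the sampling policy is stable.

```python
from statistics import NormalDist

import numpy as np


def wald_interval(X, y, a, alpha=0.05):
    """Wald-type interval for a^T theta based on ordinary least squares.

    X: (T, d) array of feature vectors for the chosen arms (assumed layout).
    y: (T,) array of observed rewards.
    a: (d,) direction defining the linear functional a^T theta.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    a = np.asarray(a, dtype=float)
    T, d = X.shape

    S = X.T @ X                                # unnormalized feature covariance
    theta_hat = np.linalg.solve(S, X.T @ y)    # OLS estimate
    resid = y - X @ theta_hat
    sigma2_hat = resid @ resid / (T - d)       # noise variance estimate

    # Standard error of a^T theta_hat: sigma_hat * sqrt(a^T S^{-1} a).
    se = np.sqrt(sigma2_hat * (a @ np.linalg.solve(S, a)))
    z = NormalDist().inv_cdf(1 - alpha / 2)    # standard normal quantile
    center = a @ theta_hat
    return center - z * se, center + z * se


if __name__ == "__main__":
    # Synthetic, non-adaptive data used only to show the call pattern.
    rng = np.random.default_rng(0)
    theta = np.array([1.0, -0.5, 2.0])
    X = rng.normal(size=(500, 3))
    y = X @ theta + rng.normal(size=500)
    print(wald_interval(X, y, a=np.array([1.0, 0.0, 0.0])))
```

The interval width scales with $\sqrt{a^\top S_T^{-1} a}$ and carries no extra $d \log T$ inflation factor. That is the sense in which a stable policy avoids the price of adaptivity.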

Taxonomy

Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms