Minimax Rate-Optimal Algorithms for High-Dimensional Stochastic Linear Bandits
Jingyu Liu, Yanglei Song

TL;DR
This paper develops minimax rate-optimal algorithms for high-dimensional stochastic linear bandits with sparse parameters, demonstrating their superior performance over existing methods through theoretical regret bounds.
Contribution
It introduces a three-stage algorithm using thresholded Lasso for high-dimensional bandits, achieving minimax optimal regret bounds up to a logarithmic factor.
Findings
Thresholded Lasso achieves the minimax rate in single-arm sequential estimation.
The proposed bandit algorithm attains near-optimal regret bounds.
The method outperforms standard Lasso in sequential estimation.
Abstract
We study the stochastic linear bandit problem with multiple arms over T rounds, where the covariate dimension d may exceed T, but each arm-specific parameter vector is s-sparse. We begin by analyzing the sequential estimation problem in the single-arm setting, focusing on cumulative mean-squared error. We show that Lasso estimators are provably suboptimal in the sequential setting in their dependence on the problem parameters, whereas thresholded Lasso estimators, obtained by applying least squares to the support selected by thresholding an initial Lasso estimator, achieve the minimax rate. Building on these insights, we consider the full linear contextual bandit problem and propose a three-stage arm selection algorithm that uses thresholded Lasso as the main estimation method. We derive an upper bound on the cumulative regret, and…
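The thresholded Lasso estimator described in the abstract (an initial Lasso fit, a hard threshold to select the support, then least squares on that support) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the coordinate-descent solver, the function names, and the tuning values lam and tau are assumptions made for the example, and the paper's own choices of penalty and threshold may differ.

```python
import numpy as np


def soft_threshold(z, gamma):
    """Scalar soft-thresholding operator used by the Lasso coordinate update."""
    return np.sign(z) * max(abs(z) - gamma, 0.0)


def lasso_cd(X, y, lam, n_iter=200):
    """Initial Lasso fit by cyclic coordinate descent (assumed solver).

    Minimizes (1 / (2n)) * ||y - X b||^2 + lam * ||b||_1.
    """
    n, d = X.shape
    beta = np.zeros(d)
    col_sq = (X ** 2).sum(axis=0) / n   # per-column curvature
    resid = y.astype(float).copy()      # residual y - X @ beta, kept up to date
    for _ in range(n_iter):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual (coordinate j removed).
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            if new != beta[j]:
                resid -= X[:, j] * (new - beta[j])
                beta[j] = new
    return beta


def thresholded_lasso(X, y, lam, tau):
    """Lasso -> hard-threshold the support -> least squares on that support.

    lam (Lasso penalty) and tau (threshold) are illustrative tuning parameters.
    """
    beta_init = lasso_cd(X, y, lam)
    support = np.flatnonzero(np.abs(beta_init) > tau)
    beta = np.zeros(X.shape[1])
    if support.size > 0:
        # Refit without shrinkage on the selected coordinates only.
        coef, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
        beta[support] = coef
    return beta, support
```

A possible usage on synthetic data with d larger than n (all values here are hypothetical):

```python
rng = np.random.default_rng(0)
n, d = 100, 200
theta = np.zeros(d)
theta[:3] = [2.0, -1.5, 1.0]            # 3-sparse parameter
X = rng.standard_normal((n, d))
y = X @ theta + 0.1 * rng.standard_normal(n)
beta_hat, support = thresholded_lasso(X, y, lam=0.1, tau=0.5)
```

The refit step is what removes the shrinkage bias of the initial Lasso estimate on the selected coordinates, which is the intuition behind preferring the thresholded estimator.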
