A Smoothed Analysis of Online Lasso for the Sparse Linear Contextual Bandit Problem
Zhiyuan Liu, Huazheng Wang, Bo Waggoner, Youjian (Eugene) Liu, Lijun Chen

TL;DR
This paper analyzes the online Lasso algorithm's performance in sparse linear contextual bandits under adversarial contexts with small random perturbations, establishing regret bounds and highlighting the role of perturbations in exploration.
Contribution
It provides a novel regret analysis for online Lasso in adversarial settings with perturbations, avoiding preconditioning and truncation methods used previously.
Findings
Regret bound of O(√(kT log d)) even when d ≫ T
Analysis shows how perturbations influence exploration length
Numerical experiments validate theoretical results
Abstract
We investigate the sparse linear contextual bandit problem where the parameter is sparse. To relieve the sampling inefficiency, we utilize the "perturbed adversary" where the context is generated adversarially but with small random non-adaptive perturbations. We prove that the simple online Lasso supports sparse linear contextual bandit with regret bound O(√(kT log d)) even when d ≫ T, where k and d are the number of effective and ambient dimensions, respectively. Compared to the recent work from Sivakumar et al. (2020), our analysis does not rely on the precondition processing, adaptive perturbation (the adaptive perturbation violates the i.i.d. perturbation setting) or truncation on the error set. Moreover, the special structures in our results explicitly characterize how the perturbation affects exploration length, guide the design of perturbation…
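To make the setting in the abstract concrete, here is a minimal sketch of an online Lasso learner facing perturbed contexts. It is an illustration under stated assumptions, not the authors' implementation: the greedy arm choice, the fixed base contexts standing in for the adversary, the regularization schedule `lam0 * sqrt(log(d) / t)`, the coordinate-descent solver, and all names (`run_bandit`, `lasso_cd`, `sigma`, `lam0`) are hypothetical choices made for this example.

```python
import math
import random


def soft_threshold(z, lam):
    """Soft-thresholding operator, the proximal map of lam * |z|."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def dot(u, v):
    return sum(a * c for a, c in zip(u, v))


def lasso_cd(G, b, lam, theta, sweeps=20):
    """Coordinate descent for
        min_theta 0.5 * theta' G theta - b' theta + lam * ||theta||_1,
    with G = X'X / n and b = X'y / n. Warm-started from `theta`."""
    d = len(b)
    theta = list(theta)
    for _ in range(sweeps):
        for j in range(d):
            if G[j][j] <= 0.0:
                continue  # coordinate never observed; leave it unchanged
            # Correlation of feature j with the residual that excludes j.
            rho = b[j] - sum(G[j][i] * theta[i] for i in range(d) if i != j)
            theta[j] = soft_threshold(rho, lam) / G[j][j]
    return theta


def run_bandit(T=200, d=20, k=2, n_arms=5, sigma=0.5, noise=0.1,
               lam0=0.1, seed=0):
    """Greedy online Lasso on perturbed contexts (illustrative assumptions)."""
    rng = random.Random(seed)
    # k-sparse true parameter: only the first k coordinates are nonzero.
    theta_star = [1.0 if j < k else 0.0 for j in range(d)]
    # Stand-in for the adversary: fixed base contexts reused every round.
    base = [[rng.uniform(-1.0, 1.0) for _ in range(d)] for _ in range(n_arms)]
    A = [[0.0] * d for _ in range(d)]  # running sum of x x'
    c = [0.0] * d                      # running sum of x * reward
    theta_hat = [0.0] * d
    regret = 0.0
    for t in range(1, T + 1):
        # Non-adaptive i.i.d. Gaussian perturbation of each base context.
        contexts = [[x + rng.gauss(0.0, sigma) for x in arm] for arm in base]
        # Greedy choice under the current Lasso estimate (no explicit
        # exploration bonus; the perturbation supplies the diversity).
        a = max(range(n_arms), key=lambda i: dot(contexts[i], theta_hat))
        means = [dot(x, theta_star) for x in contexts]
        regret += max(means) - means[a]
        reward = means[a] + rng.gauss(0.0, noise)
        x = contexts[a]
        for i in range(d):
            c[i] += x[i] * reward
            for j in range(d):
                A[i][j] += x[i] * x[j]
        # Assumed schedule: regularization shrinks like sqrt(log d / t).
        lam = lam0 * math.sqrt(math.log(d) / t)
        G = [[v / t for v in row] for row in A]
        b = [v / t for v in c]
        theta_hat = lasso_cd(G, b, lam, theta_hat)
    return theta_hat, regret


if __name__ == "__main__":
    theta_hat, regret = run_bandit()
    print("cumulative regret:", round(regret, 3))
    print("largest coordinates:",
          sorted(range(len(theta_hat)), key=lambda j: -abs(theta_hat[j]))[:2])
```

Varying `sigma` in this sketch is one way to probe the abstract's point that the perturbation size governs how much exploration the learner gets for free; the sketch makes no claim about matching the paper's constants or experiments.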
Taxonomy
Topics: Advanced Bandit Algorithms Research · Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques
