A Smoothed Analysis of Online Lasso for the Sparse Linear Contextual Bandit Problem

Zhiyuan Liu; Huazheng Wang; Bo Waggoner; Youjian (Eugene) Liu; Lijun Chen

arXiv:2007.08561·cs.LG·July 20, 2020

TL;DR

This paper analyzes the online Lasso algorithm's performance in sparse linear contextual bandits under adversarial contexts with small random perturbations, establishing regret bounds and highlighting the role of perturbations in exploration.

Contribution

It provides a novel regret analysis for online Lasso in adversarial settings with perturbations, avoiding preconditioning and truncation methods used previously.

Findings

01. Regret bound of O(√(kT log d)), even when d ≫ T

02. The analysis shows how perturbations influence exploration length

03. Numerical experiments validate the theoretical results

Abstract

We investigate the sparse linear contextual bandit problem where the parameter $\theta$ is sparse. To relieve the sampling inefficiency, we utilize the "perturbed adversary", where the context is generated adversarially but with small random non-adaptive perturbations. We prove that the simple online Lasso supports sparse linear contextual bandit with regret bound $O(\sqrt{kT \log d})$ even when $d \gg T$, where $k$ and $d$ are the effective and ambient dimensions, respectively. Compared to the recent work of Sivakumar et al. (2020), our analysis does not rely on preconditioning, adaptive perturbation (which violates the i.i.d. perturbation setting), or truncation on the error set. Moreover, the special structures in our results explicitly characterize how the perturbation affects exploration length, guide the design of perturbation…
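The setting in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes Gaussian perturbations of scale `sigma` added to fixed base contexts (a stand-in for the adversary), a greedy arm choice under the current Lasso estimate, and an illustrative regularization schedule `lam` that is not taken from the paper. The Lasso solver is a plain proximal-gradient (ISTA) routine written with NumPy, and all function and parameter names are hypothetical.

```python
import numpy as np


def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def lasso_ista(X, y, lam, n_iter=200):
    """Minimize (1/(2n))||y - X theta||^2 + lam * ||theta||_1 with ISTA."""
    n, d = X.shape
    theta = np.zeros(d)
    # Lipschitz constant of the gradient of the smooth least-squares term.
    lipschitz = np.linalg.norm(X, 2) ** 2 / n
    if lipschitz == 0.0:
        return theta
    for _ in range(n_iter):
        grad = X.T @ (X @ theta - y) / n
        theta = soft_threshold(theta - grad / lipschitz, lam / lipschitz)
    return theta


def run_greedy_online_lasso(T=200, d=50, k=3, n_arms=5,
                            sigma=0.5, noise=0.1, seed=0):
    """Greedy play on perturbed contexts; refit Lasso after every round."""
    rng = np.random.default_rng(seed)
    theta_star = np.zeros(d)
    theta_star[:k] = 1.0  # k-sparse true parameter (assumed for the demo)
    theta_hat = np.zeros(d)
    X_hist, y_hist = [], []
    regret = 0.0
    # Assumption: fixed base contexts stand in for the adversary's choices.
    base = rng.uniform(-1.0, 1.0, size=(n_arms, d)) / np.sqrt(d)
    for t in range(1, T + 1):
        # Non-adaptive i.i.d. Gaussian perturbation of each arm's context.
        contexts = base + sigma * rng.standard_normal((n_arms, d))
        # Greedy choice: no explicit exploration bonus.
        arm = int(np.argmax(contexts @ theta_hat))
        mean_rewards = contexts @ theta_star
        reward = mean_rewards[arm] + noise * rng.standard_normal()
        regret += np.max(mean_rewards) - mean_rewards[arm]
        X_hist.append(contexts[arm])
        y_hist.append(reward)
        # Illustrative schedule only; the paper's choice may differ.
        lam = noise * np.sqrt(2.0 * np.log(d) / t)
        theta_hat = lasso_ista(np.array(X_hist), np.array(y_hist), lam)
    return theta_hat, regret


if __name__ == "__main__":
    theta_hat, regret = run_greedy_online_lasso()
    print("cumulative regret:", regret)
    print("estimated support:", np.flatnonzero(np.abs(theta_hat) > 1e-3))
```

In this sketch the perturbation is the only source of exploration, so varying `sigma` gives a rough feel for how the perturbation scale changes the early rounds; it does not reproduce the paper's experiments.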

Taxonomy

Topics: Advanced Bandit Algorithms Research · Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques