Learning to Sparsify Stochastic Linear Bandits
Zhengmiao Wang, Ming Chi, Zhi-Wei Liu, Lintao Ye, Carla Fabiana Chiasserini

TL;DR
This paper introduces an adaptive algorithm for learning sparse actions in stochastic linear bandits. It achieves near-optimal regret bounds and is validated through experiments, including a recommendation-system application.
Contribution
The paper proposes an adaptively phased exploration-exploitation framework for sparse linear bandits, with efficient sparse action selection when the action set is a Euclidean ball and greedy approaches for general convex sets.
Findings
Achieves $\tilde{O}(d\sqrt{T})$ regret for Euclidean ball action sets.
Develops regret bounds for strongly convex and general convex sets.
Validates algorithms through extensive experiments, including recommendation systems.
Abstract
This paper addresses the problem of learning to sparsify stochastic linear bandits, where a decision-maker sequentially selects actions from a high-dimensional space subject to a sparsity constraint on the number of nonzero elements in the action vector. The key challenge lies in minimizing cumulative regret while tackling the potential NP-hardness of finding optimal sparse actions due to the inherent combinatorial structure of the problem. We propose an adaptively phased exploration and exploitation algorithmic framework, utilizing ordinary least squares for parameter learning and specialized subroutines for sparse action selection. When the action set is a Euclidean ball, optimal sparse actions can be efficiently computed, enabling us to establish a $\tilde{O}(d\sqrt{T})$ regret, where $d$ is the dimension of the action vector and $T$ is the time horizon length. For general…
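To illustrate why the Euclidean-ball case is tractable, here is a minimal Python sketch. It is not the authors' algorithm. It assumes a known or estimated parameter vector `theta`, a sparsity budget `s`, and a ball of radius `radius`. The function names `best_sparse_action_in_ball` and `ols_estimate` are hypothetical. Under these assumptions, the reward-maximizing action with at most `s` nonzero entries keeps the `s` largest-magnitude coordinates of `theta` and rescales them to the ball's boundary. The second function is a plain ordinary least squares fit from observed action-reward pairs.

```python
import numpy as np


def best_sparse_action_in_ball(theta, s, radius=1.0):
    """Maximize <theta, x> subject to ||x||_2 <= radius and ||x||_0 <= s.

    Illustrative helper (hypothetical name): keeps the s largest-magnitude
    coordinates of theta and scales them onto the ball's boundary.
    """
    theta = np.asarray(theta, dtype=float)
    action = np.zeros_like(theta)
    if s <= 0:
        return action
    # Indices of the s entries with the largest absolute value.
    support = np.argsort(-np.abs(theta))[:s]
    norm = np.linalg.norm(theta[support])
    if norm == 0.0:
        # theta is zero on the chosen support: every action earns zero reward.
        return action
    action[support] = radius * theta[support] / norm
    return action


def ols_estimate(actions, rewards):
    """Ordinary least squares estimate of theta from played actions (rows)
    and their observed rewards. Illustrative helper (hypothetical name)."""
    actions = np.asarray(actions, dtype=float)
    rewards = np.asarray(rewards, dtype=float)
    theta_hat, *_ = np.linalg.lstsq(actions, rewards, rcond=None)
    return theta_hat
```

For example, with `theta = [3, -4, 1, 0.5]` and `s = 2`, the selected action is `[0.6, -0.8, 0, 0]`, which earns reward 5, the Euclidean norm of the two retained coordinates. For general convex action sets, no such closed form is available, which is where the combinatorial difficulty noted in the abstract arises.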
