Improved Regret Bounds for Linear Bandits with Heavy-Tailed Rewards
Artin Tajdini, Jonathan Scarlett, Kevin Jamieson

Abstract
We study stochastic linear bandits with heavy-tailed rewards, where the rewards have a finite $(1+\epsilon)$-absolute central moment bounded by $\upsilon$ for some $\epsilon \in (0,1]$. We improve both upper and lower bounds on the minimax regret compared to prior work. When $\upsilon = \mathcal{O}(1)$, the best prior known regret upper bound is $\tilde{\mathcal{O}}(d T^{\frac{1}{1+\epsilon}})$. While a lower bound with the same scaling has been given, it relies on a construction using $\upsilon = \mathcal{O}(d)$, and adapting the construction to the bounded-moment regime with $\upsilon = \mathcal{O}(1)$ yields only an $\Omega(d^{\frac{\epsilon}{1+\epsilon}} T^{\frac{1}{1+\epsilon}})$ lower bound. This matches the known rate for multi-armed bandits and is generally loose for linear bandits, in particular being a factor of $\sqrt{d}$ below the optimal rate in the finite-variance case ($\epsilon = 1$). We…
Taxonomy
Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques · Age of Information Optimization
Methods: Sparse Evolutionary Training
