Sparse Optimistic Information Directed Sampling
Ludovic Schwartz, Hamish Flynn, Gergely Neu

TL;DR
This paper introduces SOIDS, an algorithm for high-dimensional sparse linear bandits that adapts to both the data-rich and data-poor regimes, achieving the optimal worst-case regret rate in each without Bayesian assumptions.
Contribution
The paper presents SOIDS, a sparse optimistic variant of Information Directed Sampling (IDS) that attains the optimal worst-case regret rate in both the data-rich and data-poor regimes. The guarantee rests on a new analysis that permits a time-dependent learning rate.
Findings
SOIDS achieves the optimal worst-case regret rate in the data-rich and data-poor regimes simultaneously.
Empirical results show strong performance of SOIDS.
Theoretical analysis confirms adaptivity without Bayesian assumptions.
Abstract
Many high-dimensional online decision-making problems can be modeled as stochastic sparse linear bandits. Most existing algorithms are designed to achieve optimal worst-case regret in either the data-rich regime, where polynomial dependence on the ambient dimension is unavoidable, or the data-poor regime, where dimension-independence is possible at the cost of worse dependence on the number of rounds. In contrast, the sparse Information Directed Sampling (IDS) algorithm satisfies a Bayesian regret bound that has the optimal rate in both regimes simultaneously. In this work, we explore the use of Sparse Optimistic Information Directed Sampling (SOIDS) to achieve the same adaptivity in the worst-case setting, without Bayesian assumptions. Through a novel analysis that enables the use of a time-dependent learning rate, we show that SOIDS can optimally balance information and regret. Our…
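To make "balancing information and regret" concrete, here is a minimal sketch of the generic IDS selection rule, restricted to deterministic actions: pick the action with the smallest ratio of squared estimated regret to estimated information gain. This is not the SOIDS algorithm from the paper; the optimistic surrogate quantities and the time-dependent learning rate are not shown. The function name `ids_action` and the example numbers are hypothetical.

```python
import numpy as np

def ids_action(regret_gaps, info_gains, eps=1e-12):
    """Return the index of the action minimizing gap^2 / information gain.

    regret_gaps: estimated regret of each action (hypothetical inputs).
    info_gains:  estimated information gain of each action (hypothetical inputs).
    eps:         floor on the information gain to avoid division by zero.
    """
    gaps = np.asarray(regret_gaps, dtype=float)
    gains = np.asarray(info_gains, dtype=float)
    # Information ratio: squared regret paid per unit of information acquired.
    ratios = gaps ** 2 / np.maximum(gains, eps)
    return int(np.argmin(ratios))

# Three actions: the first has low regret but reveals little,
# the last has higher regret but is far more informative.
gaps = [0.1, 0.3, 0.5]
gains = [0.01, 0.2, 1.0]
chosen = ids_action(gaps, gains)
print(chosen)
```

With these numbers the ratios are roughly 1.0, 0.45, and 0.25, so the most informative action is chosen even though its estimated regret is the largest. Full IDS typically minimizes this ratio over distributions on actions rather than over single actions; the deterministic version is used here only to keep the sketch short.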
