Regret Lower Bound and Optimal Algorithm for High-Dimensional Contextual Linear Bandit
Ke Li, Yun Yang, Naveen N. Narisetty

TL;DR
This paper establishes a minimax lower bound for regret in high-dimensional linear bandits, and introduces an efficient algorithm that matches this bound, adapting to unknown margin parameters.
Contribution
It provides the first unified regret lower bound for high-dimensional linear bandits and proposes a simple, adaptive algorithm that achieves this bound.
Findings
The lower bound depends on the dimension, the horizon, and the margin parameter.
The proposed algorithm matches the theoretical lower bound.
Simulations confirm the algorithm's effectiveness.
Abstract
In this paper, we consider the multi-armed bandit problem with high-dimensional features. First, we prove a minimax lower bound for the cumulative regret in terms of the horizon T, the dimension, and a margin parameter that controls the separation between the optimal and the sub-optimal arms. This new lower bound unifies existing regret bounds, whose dependencies on T differ because their assumptions explicitly imply different values of the margin parameter. Second, we propose a simple and computationally efficient algorithm, inspired by the general Upper Confidence Bound (UCB) strategy, that achieves a regret upper bound matching the lower bound. The proposed algorithm uses a properly centered ℓ1-ball as the confidence set in contrast to the commonly…
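As a rough illustration of how a UCB-style rule could work with a norm-ball confidence set, the sketch below assumes the ball is an ℓ1-ball centered at some estimate and computes the optimistic reward of each arm over it. It relies only on the standard dual-norm identity: the maximum of a linear function x·θ over an ℓ1-ball of radius r around a center equals x·center + r·‖x‖∞. The names `l1_ball_ucb`, `select_arm`, `theta_hat`, and `radius` are hypothetical, and the paper's actual centering, radius schedule, and estimator are not reproduced here.

```python
import numpy as np

def l1_ball_ucb(x, theta_hat, radius):
    """Largest value of x @ theta over {theta : ||theta - theta_hat||_1 <= radius}."""
    # Dual-norm identity: a linear function maximized over an l1-ball equals
    # its value at the center plus radius times the l-infinity norm of x.
    return float(x @ theta_hat + radius * np.max(np.abs(x)))

def select_arm(contexts, theta_hat, radius):
    """Return the index of the arm (row of contexts) with the largest optimistic index."""
    scores = [l1_ball_ucb(x, theta_hat, radius) for x in contexts]
    return int(np.argmax(scores))

# Toy data (assumed for illustration only): two arms, three features.
contexts = np.array([[1.0, 0.0, 0.0],
                     [0.0, 0.5, 2.0]])
theta_hat = np.array([0.5, 0.0, 0.0])

small_radius_choice = select_arm(contexts, theta_hat, radius=0.1)
large_radius_choice = select_arm(contexts, theta_hat, radius=1.0)
```

With a small radius the rule exploits the current estimate and picks the first arm; with a large radius the exploration bonus r·‖x‖∞ dominates and the second arm, which has the larger feature magnitude, is chosen instead.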
Taxonomy
Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Machine Learning and Algorithms
