Regret Lower Bound and Optimal Algorithm for High-Dimensional Contextual Linear Bandit

Ke Li; Yun Yang; Naveen N. Narisetty

arXiv:2109.11612·cs.LG·September 27, 2021

TL;DR

This paper establishes a minimax lower bound for regret in high-dimensional linear bandits, and introduces an efficient algorithm that matches this bound, adapting to unknown margin parameters.

Contribution

It provides the first unified regret lower bound for high-dimensional linear bandits and proposes a simple, adaptive algorithm that achieves this bound.

Findings

01. The lower bound depends on the dimension $d$, the horizon $T$, and the margin parameter $\alpha$.

02. The proposed algorithm attains a regret upper bound that matches the lower bound.

03. Simulations confirm the algorithm's effectiveness.

Abstract

In this paper, we consider the multi-armed bandit problem with high-dimensional features. First, we prove a minimax lower bound, $O\big((\log d)^{\frac{\alpha + 1}{2}} T^{\frac{1 - \alpha}{2}} + \log T\big)$, for the cumulative regret, in terms of horizon $T$, dimension $d$ and a margin parameter $\alpha \in [0, 1]$, which controls the separation between the optimal and the sub-optimal arms. This new lower bound unifies existing regret bound results that have different dependencies on $T$ due to the use of different values of margin parameter $\alpha$ explicitly implied by their assumptions. Second, we propose a simple and computationally efficient algorithm inspired by the general Upper Confidence Bound (UCB) strategy that achieves a regret upper bound matching the lower bound. The proposed algorithm uses a properly centered $\ell_{1}$-ball as the confidence set in contrast to the commonly…
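To make the role of the margin parameter concrete, the rate stated in the abstract can be evaluated numerically. The sketch below is illustrative only: the function name `regret_rate` and the example values of `d` and `T` are assumptions rather than anything taken from the paper, and all constant factors hidden by the $O(\cdot)$ notation are dropped.

```python
import math


def regret_rate(d: int, T: int, alpha: float) -> float:
    """Evaluate (log d)^((alpha+1)/2) * T^((1-alpha)/2) + log T.

    Hypothetical helper: constants hidden by the O(.) notation are omitted,
    so only the scaling in d, T and alpha is meaningful.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if d < 2 or T < 1:
        raise ValueError("need d >= 2 and T >= 1")
    log_d = math.log(d)
    return log_d ** ((alpha + 1) / 2) * T ** ((1 - alpha) / 2) + math.log(T)


if __name__ == "__main__":
    d, T = 1000, 10**6  # assumed example values, not from the paper
    for alpha in (0.0, 0.5, 1.0):
        print(f"alpha={alpha:.1f}  rate={regret_rate(d, T, alpha):.1f}")
```

At $\alpha = 0$ the expression reduces to $\sqrt{T \log d} + \log T$, while at $\alpha = 1$ it reduces to $\log d + \log T$. Intermediate values of $\alpha$ interpolate between these two regimes, which is how a single bound can cover results with different dependencies on $T$.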

Taxonomy

Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Machine Learning and Algorithms