Truncated LinUCB for Stochastic Linear Bandits

Yanglei Song, Meng Zhou

arXiv:2202.11735 · stat.ML · May 7, 2025



TL;DR

This paper introduces Tr-LinUCB, a truncated version of LinUCB for stochastic linear bandits: it follows LinUCB up to a truncation time $S$ and exploits greedily afterwards. With $S = C d \log T$ for a sufficiently large constant $C$, it attains $O(d \log T)$ regret, which is rate optimal in a low-dimensional regime.

Contribution

The paper proposes Tr-LinUCB, a truncation-based algorithm that improves regret bounds and demonstrates rate optimality in linear bandit problems.

Findings

01

Tr-LinUCB achieves $O(d \log T)$ regret with proper truncation.

02

A matching lower bound confirms the rate optimality of Tr-LinUCB.

03

The algorithm's performance is insensitive to the choice of truncation time in low dimensions.

Abstract

This paper considers contextual bandits with a finite number of arms, where the contexts are independent and identically distributed $d$-dimensional random vectors, and the expected rewards are linear in both the arm parameters and contexts. The LinUCB algorithm, which is near minimax optimal for related linear bandits, is shown to have a cumulative regret that is suboptimal in both the dimension $d$ and time horizon $T$, due to its over-exploration. A truncated version of LinUCB is proposed and termed "Tr-LinUCB", which follows LinUCB up to a truncation time $S$ and performs pure exploitation afterwards. The Tr-LinUCB algorithm is shown to achieve $O(d \log T)$ regret if $S = C d \log T$ for a sufficiently large constant $C$, and a matching lower bound is established, which shows the rate optimality of Tr-LinUCB in both $d$ and $T$ under a low-dimensional regime. Further, if $S =…
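The truncation scheme described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the per-arm ridge estimator, the bonus multiplier `beta`, the regularizer `lam`, the function name `tr_linucb`, the callback `reward_fn`, and the choice to keep updating the estimates after the truncation time are all assumptions made here. The constant `C = 5` in the usage lines is an arbitrary placeholder for the "sufficiently large constant".

```python
import numpy as np

def tr_linucb(contexts, reward_fn, n_arms, S, beta=1.0, lam=1.0):
    """Illustrative Tr-LinUCB loop (hypothetical names, not the paper's code).

    contexts : (T, d) array, one context per round
    reward_fn: callable (arm, context) -> observed reward
    S        : truncation time; UCB bonus is used only for rounds t < S
    """
    T, d = contexts.shape
    # Per-arm regularized Gram matrices and response vectors (ridge regression).
    A = np.stack([lam * np.eye(d) for _ in range(n_arms)])
    b = np.zeros((n_arms, d))
    choices = np.empty(T, dtype=int)
    for t in range(T):
        x = contexts[t]
        # Ridge estimate of each arm's parameter vector.
        theta_hat = np.stack([np.linalg.solve(A[k], b[k]) for k in range(n_arms)])
        scores = theta_hat @ x
        if t < S:
            # LinUCB phase: add an exploration bonus beta * sqrt(x^T A_k^{-1} x).
            widths = np.array(
                [np.sqrt(x @ np.linalg.solve(A[k], x)) for k in range(n_arms)]
            )
            scores = scores + beta * widths
        # After S the bonus is dropped, so the choice is purely greedy.
        k = int(np.argmax(scores))
        y = reward_fn(k, x)
        # Assumption: estimates keep being updated after the truncation time.
        A[k] += np.outer(x, x)
        b[k] += y * x
        choices[t] = k
    return choices, A, b

# Usage on synthetic data (all values below are made up for illustration).
rng = np.random.default_rng(0)
d, K, T = 3, 2, 500
theta = rng.normal(size=(K, d))   # hypothetical true arm parameters
X = rng.normal(size=(T, d))       # i.i.d. contexts
reward = lambda k, x: float(theta[k] @ x + 0.1 * rng.normal())
S = int(np.ceil(5 * d * np.log(T)))  # S = C d log(T) with placeholder C = 5
choices, A, b = tr_linucb(X, reward, K, S)
```

Setting `S = T` in this sketch recovers a plain LinUCB loop, while `S = 0` gives a purely greedy policy, which makes it easy to compare the three behaviours on the same simulated data.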


Taxonomy

Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques · Optimization and Search Problems