Tangential Randomization in Linear Bandits (TRAiL): Guaranteed Inference and Regret Bounds

Arda Güçlü; Subhonmesh Bose

arXiv:2411.12154·stat.ML·November 20, 2024

TL;DR

This paper introduces TRAiL, a computationally efficient forced-exploration algorithm for linear bandits that attains near-optimal regret and characterizes the trade-off between inference quality and regret growth.

Contribution

TRAiL is a computationally efficient exploration algorithm with proven regret and inference bounds; its analysis also quantifies the trade-off between regret growth and inference quality in linear bandit problems.

Findings

01

TRAiL achieves an $\tilde{O}(\sqrt{T})$ regret bound.

02

A new minimax lower bound for linear bandits is established.

03

The trade-off between regret growth and inference quality is characterized.

Abstract

We propose and analyze TRAiL (Tangential Randomization in Linear Bandits), a computationally efficient regret-optimal forced exploration algorithm for linear bandits on action sets that are sublevel sets of strongly convex functions. TRAiL estimates the governing parameter of the linear bandit problem through a standard regularized least squares and perturbs the reward-maximizing action corresponding to said point estimate along the tangent plane of the convex compact action set before projecting back to it. Exploiting concentration results for matrix martingales, we prove that TRAiL ensures a $\Omega(\sqrt{T})$ growth in the inference quality, measured via the minimum eigenvalue of the design (regressor) matrix, with high probability over a $T$-length period. We build on this result to obtain an $O(\sqrt{T}\log(T))$ upper bound on cumulative regret with probability at least $…
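The loop described in the abstract (regularized least-squares estimate, greedy action, tangential perturbation, projection) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the action set is the Euclidean unit ball in dimension at least 2, Gaussian reward noise, and a perturbation scale of $t^{-1/4}$; the function names, the scale schedule, and all default parameters are hypothetical choices made only for illustration.

```python
import numpy as np


def greedy_action(theta_hat):
    """Reward-maximizing action on the unit ball for a point estimate."""
    norm = np.linalg.norm(theta_hat)
    if norm == 0.0:
        # No information yet: pick an arbitrary unit vector.
        e = np.zeros_like(theta_hat)
        e[0] = 1.0
        return e
    return theta_hat / norm


def tangential_perturbation(action, scale, rng):
    """Random vector of length `scale` orthogonal to the unit-norm `action`."""
    g = rng.standard_normal(action.shape)
    g -= (g @ action) * action  # drop the component normal to the sphere
    return scale * g / np.linalg.norm(g)


def project_to_ball(x):
    """Euclidean projection onto the unit ball."""
    return x / max(1.0, np.linalg.norm(x))


def run_sketch(theta_star, horizon, reg=1.0, noise_std=0.1, seed=0):
    """Simulate the loop; returns (regret, min eigenvalue of V, estimate)."""
    rng = np.random.default_rng(seed)
    d = theta_star.shape[0]
    V = reg * np.eye(d)   # regularized design (regressor) matrix
    b = np.zeros(d)       # running sum of reward-weighted actions
    best = np.linalg.norm(theta_star)  # optimal expected reward on the unit ball
    regret = 0.0
    for t in range(1, horizon + 1):
        theta_hat = np.linalg.solve(V, b)      # ridge estimate
        a = greedy_action(theta_hat)
        # Assumed schedule: perturbation shrinks like t^(-1/4).
        a = project_to_ball(a + tangential_perturbation(a, t ** -0.25, rng))
        reward = theta_star @ a + noise_std * rng.standard_normal()
        V += np.outer(a, a)
        b += reward * a
        regret += best - theta_star @ a
    return regret, np.linalg.eigvalsh(V)[0], np.linalg.solve(V, b)


if __name__ == "__main__":
    regret, min_eig, theta_hat = run_sketch(np.array([1.0, 0.5, -0.5]), 2000)
    print(f"regret={regret:.2f}  lambda_min(V)={min_eig:.2f}  estimate={theta_hat}")
```

For the unit ball, the tangent plane at a boundary point is simply the set of vectors orthogonal to that point, which keeps the perturbation step to one projection. A general sublevel set of a strongly convex function would need the gradient of that function at the greedy action instead.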

Taxonomy

Topics: Advanced Bandit Algorithms Research · Distributed Sensor Networks and Detection Algorithms · Data Stream Mining Techniques

Methods: Sparse Evolutionary Training