Squeeze All: Novel Estimator and Self-Normalized Bound for Linear Contextual Bandits

Wonyoung Kim; Myunghee Cho Paik; Min-hwan Oh

arXiv:2206.05404 · stat.ML · March 30, 2023

Open Access

TL;DR

This paper introduces a new linear contextual bandit algorithm with a novel estimator and self-normalized bounds, achieving near-optimal regret and outperforming existing methods in experiments.

Contribution

The paper presents a novel estimator with embedded exploration and a self-normalized bound, leading to improved regret analysis for linear bandits.

Findings

01

Achieves $O(\sqrt{dT\log T})$ regret bound.

02

Establishes a new lower bound of $\Omega(\sqrt{dT})$ for the problem.

03

Numerical experiments demonstrate superior performance over existing algorithms.
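The regret figures above refer to cumulative regret: the summed gap between the best arm's expected reward and the chosen arm's expected reward over $T$ rounds. The following is a minimal sketch of that quantity on synthetic data, assuming a linear reward model with a hypothetical parameter `beta` and hypothetical array shapes. It is not the paper's estimator or algorithm; the two policies (an oracle and uniform random choice) are included only to show the two extremes.

```python
import numpy as np


def cumulative_regret(contexts, beta, chosen):
    """Cumulative regret under an assumed linear reward model.

    contexts: array of shape (T, N, d), one d-dimensional context per arm per round
    beta:     array of shape (d,), the (hypothetical) true parameter
    chosen:   integer array of shape (T,), the arm pulled in each round
    """
    expected = contexts @ beta                      # (T, N) expected rewards
    best = expected.max(axis=1)                     # best achievable per round
    got = expected[np.arange(len(chosen)), chosen]  # reward of the pulled arm
    return np.cumsum(best - got)


# Synthetic problem: all sizes and distributions are illustrative assumptions.
rng = np.random.default_rng(0)
T, N, d = 500, 5, 3
beta = rng.normal(size=d)
contexts = rng.normal(size=(T, N, d))

oracle = (contexts @ beta).argmax(axis=1)  # always picks the best arm
uniform = rng.integers(N, size=T)          # ignores the contexts entirely

regret_oracle = cumulative_regret(contexts, beta, oracle)
regret_uniform = cumulative_regret(contexts, beta, uniform)
print(regret_oracle[-1], regret_uniform[-1])
```

The oracle incurs zero regret by construction, while uniform choice accumulates regret roughly linearly in $T$. A bound of order $\sqrt{dT\log T}$ means regret grows sublinearly, between those two extremes.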

Abstract

We propose a linear contextual bandit algorithm with $O(\sqrt{dT\log T})$ regret bound, where $d$ is the dimension of contexts and $T$ is the time horizon. Our proposed algorithm is equipped with a novel estimator in which exploration is embedded through explicit randomization. Depending on the randomization, our proposed estimator takes contributions either from contexts of all arms or from selected contexts. We establish a self-normalized bound for our estimator, which allows a novel decomposition of the cumulative regret into additive dimension-dependent terms instead of multiplicative terms. We also prove a novel lower bound of $\Omega(\sqrt{dT})$ under our problem setting. Hence, the regret of our proposed algorithm matches the lower bound up to logarithmic factors. The numerical experiments support the theoretical guarantees and show that our proposed method outperforms…
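To make "up to logarithmic factors" concrete, the ratio of the stated upper-bound rate to the stated lower-bound rate can be written out. This is a sketch that ignores the constants hidden by the asymptotic notation, and it assumes a natural logarithm for the numerical value:

$$\frac{\sqrt{dT\log T}}{\sqrt{dT}} = \sqrt{\log T}$$

The gap does not depend on $d$ and grows slowly in $T$: for $T = 10^4$, $\sqrt{\log T} \approx 3.03$.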

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Data Stream Mining Techniques