Norm-Agnostic Linear Bandits

Spencer (Brady) Gales; Sunder Sethuraman; Kwang-Sung Jun

arXiv:2205.01257·stat.ML·May 4, 2022

TL;DR

This paper introduces linear bandit algorithms that need no prior upper bound on the norm of the unknown parameter, yet still come with low-regret guarantees.

Contribution

The paper presents the first algorithms for linear bandits that operate effectively without knowing the norm bound of the unknown parameter, with proven regret bounds.

Findings

01 Algorithms achieve low regret without prior norm knowledge.

02 In the changing arm set setting, not knowing the norm bound leaves the leading term of the regret bound unchanged and inflates only the lower-order term.

03 Standard algorithms can fail catastrophically when the norm bound assumption is violated.

Abstract

Linear bandits have a wide variety of applications including recommendation systems, yet they make one strong assumption: the algorithms must know an upper bound $S$ on the norm of the unknown parameter $\theta^{*}$ that governs the reward generation. Such an assumption forces the practitioner to guess $S$ involved in the confidence bound, leaving no choice but to wish that $\|\theta^{*}\| \leq S$ is true to guarantee that the regret will be low. In this paper, we propose novel algorithms that do not require such knowledge for the first time. Specifically, we propose two algorithms and analyze their regret bounds: one for the changing arm set setting and the other for the fixed arm set setting. Our regret bound for the former shows that the price of not knowing $S$ does not affect the leading term in the regret bound and inflates only the lower order term. For the latter, we do not pay any…
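
To make concrete where the guessed bound $S$ enters a standard algorithm, the sketch below computes the usual self-normalized confidence radius used by OFUL-style methods, $R\sqrt{2\log\big(\det(V)^{1/2}\det(\lambda I)^{-1/2}/\delta\big)} + \sqrt{\lambda}\,S$. This is the conventional baseline the abstract alludes to, not the algorithms proposed in the paper. The function name `oful_radius`, the noise scale `R`, the regularizer `lam`, and the random arm vectors are illustrative assumptions.

```python
import numpy as np


def oful_radius(V, lam, S, R=1.0, delta=0.05):
    """Confidence radius of a standard OFUL-style ellipsoid (illustrative).

    V     : regularized design matrix, lam * I + sum_s x_s x_s^T
    lam   : ridge regularization parameter (assumed)
    S     : guessed upper bound on the norm of theta*
    R     : sub-Gaussian noise scale (assumed known)
    delta : failure probability
    """
    d = V.shape[0]
    # log det(V), computed stably
    _, logdet_V = np.linalg.slogdet(V)
    # log( det(V)^{1/2} * det(lam * I)^{-1/2} )
    log_ratio = 0.5 * logdet_V - 0.5 * d * np.log(lam)
    # The guess S enters only through the additive sqrt(lam) * S term.
    return R * np.sqrt(2.0 * (log_ratio + np.log(1.0 / delta))) + np.sqrt(lam) * S


# Hypothetical data: 200 random unit-norm arm vectors in dimension 5.
rng = np.random.default_rng(0)
d, T, lam = 5, 200, 1.0
X = rng.normal(size=(T, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)
V = lam * np.eye(d) + X.T @ X

radius_small_S = oful_radius(V, lam, S=1.0)
radius_large_S = oful_radius(V, lam, S=10.0)
print(radius_small_S, radius_large_S)
```

The two radii differ by exactly $\sqrt{\lambda}\,(10 - 1)$: an overly large guess widens every confidence set, while a guess below $\|\theta^{*}\|$ removes the guarantee that the set contains $\theta^{*}$ at all.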

Taxonomy

Topics: Advanced Bandit Algorithms Research