Linear Contextual Bandits with Hybrid Payoff: Revisited

Nirjhar Das; Gaurav Sinha

arXiv:2406.10131·cs.LG·September 5, 2024


TL;DR

This paper revisits the linear contextual bandit problem with hybrid rewards, introducing a new algorithm HyLinUCB that improves regret bounds and performs well empirically across various settings, especially with many arm-specific parameters.

Contribution

The paper provides new regret analyses for existing algorithms under hybrid rewards, and introduces HyLinUCB, a modified algorithm with improved theoretical guarantees and empirical performance.

Findings

01

HyLinUCB achieves $O(\sqrt{T})$ regret under certain conditions.

02

DisLinUCB performs best when many arm-specific parameters are present.

03

HyLinUCB's regret grows more slowly with the number of arms than that of the baselines.

Abstract

We study the Linear Contextual Bandit problem in the hybrid reward setting. In this setting, every arm's reward model contains arm-specific parameters in addition to parameters shared across the reward models of all the arms. We can reduce this setting to two closely related settings: (a) Shared, with no arm-specific parameters, and (b) Disjoint, with only arm-specific parameters. This enables the application of two popular state-of-the-art algorithms, $LinUCB$ and $DisLinUCB$ (Algorithm 1 in (Li et al. 2010)). When the arm features are stochastic and satisfy a popular diversity condition, we provide new regret analyses for both algorithms, significantly improving on the known regret guarantees of these algorithms. Our novel analysis critically exploits the hybrid reward structure and the diversity condition. Moreover, we introduce a new algorithm $HyLinUCB$ that crucially…
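The hybrid reward model and its reduction to the Shared setting can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the abstract: the symbols `x` (features paired with the shared parameters `theta`), `z` (features paired with the arm-specific parameters `beta_i`), the dimensions `d1` and `d2`, and the helper functions. The embedding shown, which concatenates the shared block with one zero-padded block per arm, is one standard way to carry out such a reduction; the paper's own construction may differ.

```python
import random

def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

def hybrid_reward(x, z, theta, beta_i):
    """Expected reward of one arm: shared part x.theta plus arm-specific part z.beta_i."""
    return dot(x, theta) + dot(z, beta_i)

def to_shared_features(x, z, arm, num_arms):
    """Embed (x, z) for `arm` into one sparse vector of length d1 + num_arms * d2.

    Only the shared block and the block belonging to `arm` are non-zero.
    """
    d2 = len(z)
    phi = list(x) + [0.0] * (num_arms * d2)
    start = len(x) + arm * d2
    phi[start:start + d2] = z
    return phi

def stack_parameters(theta, betas):
    """Concatenate theta with every arm's beta, in the layout used by to_shared_features."""
    stacked = list(theta)
    for beta in betas:
        stacked.extend(beta)
    return stacked

# Hypothetical problem sizes and randomly drawn parameters (illustration only).
rng = random.Random(0)
d1, d2, num_arms = 3, 2, 4
theta = [rng.uniform(-1, 1) for _ in range(d1)]
betas = [[rng.uniform(-1, 1) for _ in range(d2)] for _ in range(num_arms)]
x = [rng.uniform(-1, 1) for _ in range(d1)]
z = [rng.uniform(-1, 1) for _ in range(d2)]
arm = 2

direct = hybrid_reward(x, z, theta, betas[arm])
phi = to_shared_features(x, z, arm, num_arms)
via_shared = dot(phi, stack_parameters(theta, betas))
```

Under these assumptions `direct` and `via_shared` agree up to floating-point rounding, while the embedded vector has `d1 + num_arms * d2` coordinates of which at most `d1 + d2` are non-zero. That sparsity is the kind of structure a dimension-dependent analysis of the reduced problem would want to exploit.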


Code & Models

Repositories

nirjhar-das/hypay_bandits
Official


Taxonomy

Topics: Advanced Bandit Algorithms Research · Smart Grid Energy Management · Cognitive Radio Networks and Spectrum Sensing