Linear Contextual Bandits with Hybrid Payoff: Revisited
Nirjhar Das, Gaurav Sinha

TL;DR
This paper revisits the linear contextual bandit problem with hybrid rewards, introducing a new algorithm HyLinUCB that improves regret bounds and performs well empirically across various settings, especially with many arm-specific parameters.
Contribution
The paper provides new regret analyses for existing algorithms under hybrid rewards, and introduces HyLinUCB, a modified algorithm with improved theoretical guarantees and empirical performance.
Findings
HyLinUCB achieves $O(\sqrt{T})$ regret under certain conditions.
DisLinUCB performs best when many arm-specific parameters are present.
HyLinUCB's regret grows slower with the number of arms compared to baselines.
Abstract
We study the Linear Contextual Bandit problem in the hybrid reward setting. In this setting, every arm's reward model contains arm-specific parameters in addition to parameters shared across the reward models of all the arms. We can reduce this setting to two closely related settings, (a) Shared - no arm-specific parameters, and (b) Disjoint - only arm-specific parameters, enabling the application of two popular state-of-the-art algorithms - LinUCB and DisLinUCB (Algorithm 1 in (Li et al. 2010)). When the arm features are stochastic and satisfy a popular diversity condition, we provide new regret analyses for both algorithms, significantly improving on the known regret guarantees of these algorithms. Our novel analysis critically exploits the hybrid reward structure and the diversity condition. Moreover, we introduce a new algorithm that crucially…
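The reduction from the hybrid setting to the Shared setting mentioned in the abstract can be illustrated with a small sketch. It assumes a hybrid reward of the form x·θ + z·β_i, where x holds shared features, z holds arm-specific features, θ is the shared parameter and β_i is the parameter of arm i. The symbols, dimensions, and the function name `to_shared_features` are illustrative assumptions, not notation taken from the paper. The idea is to embed each (x, z) pair into one long, mostly zero vector so that a single parameter vector explains every arm's reward.

```python
import numpy as np


def to_shared_features(x, z, arm, num_arms):
    """Embed hybrid features into one long vector (hypothetical helper).

    x: shared features, shape (d1,)
    z: arm-specific features, shape (d2,)
    Returns a vector of shape (d1 + num_arms * d2,) that is zero everywhere
    except the shared block and the block that belongs to `arm`.
    """
    d1, d2 = x.shape[0], z.shape[0]
    phi = np.zeros(d1 + num_arms * d2)
    phi[:d1] = x                      # shared block
    start = d1 + arm * d2             # offset of this arm's block
    phi[start:start + d2] = z         # arm-specific block
    return phi


def hybrid_reward(x, z, theta, beta_arm):
    """Expected reward under the assumed hybrid model: x.theta + z.beta_arm."""
    return float(x @ theta + z @ beta_arm)


# Small check with made-up dimensions (assumptions for illustration only).
rng = np.random.default_rng(0)
d1, d2, K = 3, 2, 4
theta = rng.normal(size=d1)           # shared parameter
betas = rng.normal(size=(K, d2))      # one parameter vector per arm
full_param = np.concatenate([theta, betas.ravel()])

x, z, arm = rng.normal(size=d1), rng.normal(size=d2), 2
phi = to_shared_features(x, z, arm, K)

# The long linear model reproduces the hybrid reward exactly.
assert np.isclose(phi @ full_param, hybrid_reward(x, z, theta, betas[arm]))
```

The embedded vector has d1 + K·d2 coordinates but at most d1 + d2 nonzero entries, which is the kind of sparsity a hybrid-aware analysis can exploit. The Disjoint reduction goes the other way: each arm keeps its own copy of the concatenated parameter (θ, β_i) and sees the concatenated feature (x, z).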
Taxonomy
Topics: Advanced Bandit Algorithms Research · Smart Grid Energy Management · Cognitive Radio Networks and Spectrum Sensing
