High Probability Bound for Cross-Learning Contextual Bandits with Unknown Context Distributions

Ruiyuan Huang; Zengfeng Huang

arXiv:2410.04080·cs.LG·January 27, 2025


TL;DR

This paper proves a high-probability regret bound for cross-learning contextual bandits with unknown context distributions, strengthening an earlier guarantee that held only in expectation.

Contribution

The paper gives a novel high-probability analysis of the algorithm of Schneider and Zimmert (2023), drawing on new insights into epoch dependencies and on refined martingale inequalities.

Findings

01. Achieves near-optimal high-probability regret bounds

02. Introduces new analysis techniques for epoch dependencies

03. Refines martingale inequalities for better bounds

Abstract

Motivated by applications in online bidding and sleeping bandits, we examine the problem of contextual bandits with cross learning, where the learner observes the loss associated with the action across all possible contexts, not just the current round's context. Our focus is on a setting where losses are chosen adversarially, and contexts are sampled i.i.d. from a specific distribution. This problem was first studied by Balseiro et al. (2019), who proposed an algorithm that achieves near-optimal regret under the assumption that the context distribution is known in advance. However, this assumption is often unrealistic. To address this issue, Schneider and Zimmert (2023) recently proposed a new algorithm that achieves nearly optimal expected regret. It is well known that expected-regret guarantees can be significantly weaker than high-probability bounds. In this paper, we present a novel, in-depth…


Taxonomy

Topics: Advanced Bandit Algorithms Research · Data Stream Mining Techniques · Cognitive Radio Networks and Spectrum Sensing
