Nonstochastic Bandits and Experts with Arm-Dependent Delays
Dirk van der Hoeven, Nicolò Cesa-Bianchi

TL;DR
This paper introduces algorithms and regret bounds for nonstochastic bandits and experts when delays depend on both the round and the arm. The resulting first-order bounds depend only on the losses and delays of the best arm.
Contribution
It presents the first regret bounds for delayed nonstochastic bandits and experts that depend solely on the best arm's losses and delays, and it introduces a novel drift bound.
Findings
First-order regret bounds that depend on the best arm's losses and delays
Algorithms for both full information and bandit settings with arm-dependent delays
Extension of existing algorithms to handle arm-dependent delays effectively
Abstract
We study nonstochastic bandits and experts in a delayed setting where delays depend on both time and arms. While the setting in which delays only depend on time has been extensively studied, the arm-dependent delay setting better captures real-world applications at the cost of introducing new technical challenges. In the full information (experts) setting, we design an algorithm with a first-order regret bound that reveals an interesting trade-off between delays and losses. We prove a similar first-order regret bound also for the bandit setting, when the learner is allowed to observe how many losses are missing. These are the first bounds in the delayed setting that depend on the losses and delays of the best arm only. When in the bandit setting no information other than the losses is observed, we still manage to prove a regret bound through a modification to the algorithm of Zimmert…
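To make the full-information protocol described in the abstract concrete, the following is a minimal sketch of an experts game with arm-dependent delays. It uses plain exponential weights (Hedge) over whatever losses have arrived so far, which is a standard baseline and not the algorithm proposed in the paper. The function name `delayed_hedge`, the learning rate `eta`, and the convention that the loss of arm `i` in round `t` arrives at the end of round `t + delays[t][i]` are all assumptions made for illustration.

```python
import math


def delayed_hedge(losses, delays, eta):
    """Run Hedge on an experts problem with arm-dependent delays.

    losses[t][i] is the loss of arm i in round t (assumed in [0, 1]).
    delays[t][i] is the hypothetical delay of that loss: it becomes
    visible to the learner at the end of round t + delays[t][i].
    Returns (expected learner loss, regret against the best arm).
    """
    T = len(losses)
    K = len(losses[0])
    observed = [0.0] * K   # cumulative losses that have arrived, per arm
    pending = {}           # arrival round -> list of (arm, loss)
    learner_loss = 0.0

    for t in range(T):
        # Exponential weights computed from the arrived losses only.
        base = min(observed)  # shift for numerical stability
        weights = [math.exp(-eta * (c - base)) for c in observed]
        total = sum(weights)
        probs = [w / total for w in weights]

        # Expected loss of the learner in this round.
        learner_loss += sum(probs[i] * losses[t][i] for i in range(K))

        # Each arm's loss is scheduled with its own delay.
        for i in range(K):
            pending.setdefault(t + delays[t][i], []).append((i, losses[t][i]))

        # Deliver every loss whose arrival round is t.
        for i, loss in pending.pop(t, []):
            observed[i] += loss

    best = min(sum(losses[t][i] for t in range(T)) for i in range(K))
    return learner_loss, learner_loss - best


if __name__ == "__main__":
    T = 10
    losses = [[0.0, 1.0] for _ in range(T)]    # arm 0 is always better
    only_bad_arm_late = [[0, T] for _ in range(T)]
    only_good_arm_late = [[T, 0] for _ in range(T)]
    print(delayed_hedge(losses, only_bad_arm_late, eta=0.5))
    print(delayed_hedge(losses, only_good_arm_late, eta=0.5))
```

In this toy instance the two delay patterns behave very differently. When only the bad arm's losses are delayed past the horizon, the learner never sees a nonzero loss and stays uniform. When only the good arm's losses are delayed, the learner still observes the bad arm's losses and shifts weight away from it. This is only meant to illustrate why per-arm delays interact with per-arm losses; it makes no claim about the bounds proved in the paper.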
Taxonomy
Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Smart Grid Energy Management
