One More Step Towards Reality: Cooperative Bandits with Imperfect Communication
Udari Madhushani, Abhimanyu Dubey, Naomi Ehrich Leonard, Alex Pentland

TL;DR
This paper extends cooperative bandit algorithms to communication that is stochastic, delayed, or adversarially corrupted, with near-optimal group regret guarantees, and also gives improved algorithms for the perfect-communication setting.
Contribution
The paper introduces decentralized algorithms for cooperative bandits under several imperfect-communication models, each with theoretical guarantees, along with improved methods for the perfect-communication setting.
Findings
Algorithms achieve near-optimal group regret in stochastic and adversarial networks.
Proposed methods outperform existing algorithms in empirical evaluations.
Tight lower bounds are established for network-dependent regret.
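To make "group regret" over an unreliable network concrete, here is a minimal simulation sketch. It is not the paper's algorithm: it assumes plain UCB1 agents on a path graph with Bernoulli arms, where each link independently delivers both endpoints' latest observations with probability `link_prob`. The function name `simulate_group_regret` and all parameters are hypothetical, chosen for illustration only.

```python
import math
import random


def simulate_group_regret(num_agents=5, means=(0.9, 0.5, 0.4),
                          horizon=500, link_prob=0.5, seed=0):
    """Illustrative only: UCB1 agents sharing rewards over randomly failing links."""
    rng = random.Random(seed)
    k = len(means)
    best = max(means)
    # Assumed topology: a path graph 0 - 1 - ... - (num_agents - 1).
    edges = [(i, i + 1) for i in range(num_agents - 1)]
    counts = [[0] * k for _ in range(num_agents)]   # observations per agent, per arm
    sums = [[0.0] * k for _ in range(num_agents)]   # summed rewards per agent, per arm
    group_regret = 0.0

    for t in range(1, horizon + 1):
        pulls = []
        for i in range(num_agents):
            untried = [a for a in range(k) if counts[i][a] == 0]
            if untried:
                arm = untried[0]
            else:
                # UCB1 index computed from own and received observations.
                arm = max(
                    range(k),
                    key=lambda a: sums[i][a] / counts[i][a]
                    + math.sqrt(2.0 * math.log(t) / counts[i][a]),
                )
            reward = 1.0 if rng.random() < means[arm] else 0.0
            pulls.append((arm, reward))
            # Group regret sums the expected loss of every agent's pull.
            group_regret += best - means[arm]

        # Each agent records its own observation.
        for i, (arm, reward) in enumerate(pulls):
            counts[i][arm] += 1
            sums[i][arm] += reward

        # Each link is active this round with probability link_prob.
        for i, j in edges:
            if rng.random() < link_prob:
                arm_i, reward_i = pulls[i]
                arm_j, reward_j = pulls[j]
                counts[j][arm_i] += 1
                sums[j][arm_i] += reward_i
                counts[i][arm_j] += 1
                sums[i][arm_j] += reward_j

    return group_regret
```

Calling `simulate_group_regret(link_prob=0.0)` and `simulate_group_regret(link_prob=1.0)` over several seeds gives a rough feel for how much shared observations help in this toy setting. Delays and adversarial corruption, which the paper also covers, are not modeled here.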
Abstract
The cooperative bandit problem is becoming increasingly relevant due to its applications in large-scale decision-making. However, most research on this problem focuses exclusively on the setting with perfect communication, whereas in most real-world distributed settings, communication takes place over stochastic networks with arbitrary corruptions and delays. In this paper, we study cooperative bandit learning under three typical real-world communication scenarios, namely, (a) message-passing over stochastic time-varying networks, (b) instantaneous reward-sharing over a network with random delays, and (c) message-passing with adversarially corrupted rewards, including byzantine communication. For each of these environments, we propose decentralized algorithms that achieve competitive performance, along with near-optimal guarantees on the incurred group regret. Furthermore, in the…
Taxonomy
Topics: Advanced Bandit Algorithms Research · Cognitive Radio Networks and Spectrum Sensing · Age of Information Optimization
