Doubly Adversarial Federated Bandits
Jialin Yi, Milan Vojnović

TL;DR
This paper introduces a new federated multi-armed bandit problem in a doubly adversarial setting, and provides regret lower bounds and a near-optimal algorithm that operates without sharing detailed feedback among agents.
Contribution
It formulates the doubly adversarial federated bandit problem, derives regret lower bounds, and proposes FEDEXP3, a near-optimal algorithm that works with limited communication.
Findings
FEDEXP3 achieves sub-linear regret in the bandit feedback setting.
Regret lower bounds are established for both the full-information and the bandit feedback settings.
Numerical experiments validate the effectiveness of FEDEXP3 on synthetic and real data.
Abstract
We study a new non-stochastic federated multi-armed bandit problem with multiple agents collaborating via a communication network. The losses of the arms are assigned by an oblivious adversary that specifies the loss of each arm not only for each time step but also for each agent, which we call "doubly adversarial". In this setting, different agents may choose the same arm in the same time step but observe different feedback. The goal of each agent is to find a globally best arm in hindsight that has the lowest cumulative loss averaged over all agents, which necessitates communication among agents. We provide regret lower bounds for any federated bandit algorithm under different settings, when agents have access to full-information feedback or to bandit feedback. For the bandit feedback setting, we propose a near-optimal federated bandit algorithm called FEDEXP3. Our algorithm…
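The objective described in the abstract can be made concrete with a small sketch. The code below is an illustration of the problem setting only, not of FEDEXP3: it assumes losses in [0, 1] stored in a hypothetical array `losses` of shape `(T, N, K)` (time steps, agents, arms), and it assumes an agent's regret is measured on the agent-averaged loss against the globally best arm in hindsight. The function names and the toy numbers are invented for illustration.

```python
import numpy as np

def global_best_arm(losses):
    """Return the arm with the lowest cumulative loss averaged over agents.

    losses: array of shape (T, N, K) = (time steps, agents, arms).
    """
    avg_over_agents = losses.mean(axis=1)      # shape (T, K)
    cumulative = avg_over_agents.sum(axis=0)   # shape (K,)
    return int(np.argmin(cumulative)), cumulative

def agent_regret(losses, actions):
    """Regret of one agent whose chosen arms are `actions` (shape (T,)).

    Assumption: the agent is charged the agent-averaged loss of the arm
    it plays, and is compared with the globally best arm in hindsight.
    """
    avg_over_agents = losses.mean(axis=1)
    steps = np.arange(losses.shape[0])
    incurred = avg_over_agents[steps, actions].sum()
    _, cumulative = global_best_arm(losses)
    return float(incurred - cumulative.min())

# Toy instance (made-up numbers): 4 time steps, 2 agents, 3 arms.
losses = np.zeros((4, 2, 3))
losses[:, 0, :] = [0.0, 0.4, 1.0]   # agent 0 sees arm 0 as the best
losses[:, 1, :] = [1.0, 0.4, 0.0]   # agent 1 sees arm 2 as the best

best_arm, cumulative = global_best_arm(losses)
local_best_agent0 = int(np.argmin(losses[:, 0, :].sum(axis=0)))

# Agent 0 greedily plays its locally best arm at every step.
regret_local_play = agent_regret(losses, np.full(4, local_best_agent0))
```

In this toy instance each agent's locally best arm differs from the globally best arm, so an agent that relies only on its own feedback keeps paying regret at every step. This is the sense in which the global objective requires communication among agents.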
Taxonomy
Topics: Advanced Bandit Algorithms Research · Age of Information Optimization · Privacy-Preserving Technologies in Data
