Federated Bandit: A Gossiping Approach
Zhaowei Zhu, Jingxuan Zhu, Ji Liu, Yang Liu

TL;DR
This paper introduces a decentralized federated multi-armed bandit algorithm that enables agents to collaboratively learn optimal actions through gossiping, while preserving privacy and achieving sublinear regret.
Contribution
It proposes Gossip_UCB, a novel decentralized bandit algorithm combining gossiping and UCB, and extends it to Fed_UCB, which ensures differential privacy with competitive regret bounds.
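To make the coupling of gossiping and UCB concrete, here is a minimal sketch. It is not the paper's pseudocode, which this summary does not reproduce. The function names, the running-consensus update, the confidence-bonus constant `scale`, the Gaussian noise level, and the assumption that each agent's bias averages to zero across the network are all illustrative assumptions.

```python
import math
import random


def gossip_step(estimates, weights):
    """One gossip round: agent i takes a weighted average of all agents' values.

    `weights` is assumed to be a doubly stochastic matrix whose nonzero
    entries follow the communication graph (hypothetical input format).
    """
    n = len(estimates)
    return [sum(weights[i][j] * estimates[j] for j in range(n)) for i in range(n)]


def ucb_index(mean_estimate, pulls, t, scale=2.0):
    """Standard UCB-style index: estimate plus an exploration bonus."""
    if pulls == 0:
        return math.inf  # force each arm to be tried at least once
    return mean_estimate + math.sqrt(scale * math.log(t) / pulls)


def run_gossip_ucb(arm_means, biases, weights, horizon, seed=0):
    """Simulate agents that pick arms by UCB on gossiped estimates.

    Agent i observes arm k with mean arm_means[k] + biases[i][k].
    """
    rng = random.Random(seed)
    n, m = len(biases), len(arm_means)
    pulls = [[0] * m for _ in range(n)]
    local_mean = [[0.0] * m for _ in range(n)]  # each agent's own sample mean
    z = [[0.0] * m for _ in range(n)]           # gossiped estimate per agent/arm
    for t in range(1, horizon + 1):
        new_local = [row[:] for row in local_mean]
        for i in range(n):
            k = max(range(m), key=lambda a: ucb_index(z[i][a], pulls[i][a], t))
            reward = arm_means[k] + biases[i][k] + rng.gauss(0, 0.1)
            pulls[i][k] += 1
            new_local[i][k] += (reward - local_mean[i][k]) / pulls[i][k]
        for k in range(m):
            mixed = gossip_step([z[i][k] for i in range(n)], weights)
            for i in range(n):
                # Mix with neighbors, then add the change in the local mean.
                z[i][k] = mixed[i] + new_local[i][k] - local_mean[i][k]
        local_mean = new_local
    return z, local_mean, pulls


# Two agents with opposite biases on a fully mixing two-node graph.
z, local_mean, pulls = run_gossip_ucb(
    arm_means=[0.2, 0.8],
    biases=[[0.1, -0.1], [-0.1, 0.1]],
    weights=[[0.5, 0.5], [0.5, 0.5]],
    horizon=200,
)
```

With doubly stochastic weights, this update keeps the network-wide sum of the gossiped estimates equal to the sum of the local sample means, so each agent's estimate tracks the network average rather than its own biased view.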
Findings
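Gossip_UCB achieves regret of order O(max(poly(N,M) log T, poly(N,M) log_{λ_2^{-1}} N)), where N is the number of agents, M the number of arms, T the time horizon, and λ_2 the second largest eigenvalue of the expected gossip matrix.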
Fed_UCB maintains differential privacy with regret bounds of order O(max(poly(N,M)/ε log^{2.5} T, poly(N,M)(log_{λ_2^{-1}} N + log T))).
The algorithms effectively enable decentralized, privacy-preserving multi-agent bandit learning with theoretical guarantees.
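The 1/ε factor in the Fed_UCB bound is what one would expect from noise-based privacy mechanisms, where a smaller ε means more noise. The sketch below shows the standard Laplace mechanism as an illustration only. This summary does not describe Fed_UCB's actual mechanism, so the function names, the `sensitivity` parameter, and the choice of Laplace noise are assumptions.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize(value, sensitivity, epsilon, rng):
    """Standard Laplace mechanism: add noise with scale sensitivity / epsilon.

    Hypothetical helper: an agent would apply this to a statistic
    before sharing it with neighbors.
    """
    return value + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)
shared = privatize(value=0.3, sensitivity=1.0, epsilon=0.5, rng=rng)
```

Because the noise scale grows as 1/ε, stronger privacy (smaller ε) makes each shared value less accurate, which is one way to read the 1/ε term in the regret bound.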
Abstract
In this paper, we study Federated Bandit, a decentralized Multi-Armed Bandit problem with a set of N agents, who can only communicate their local data with neighbors described by a connected graph G. Each agent makes a sequence of decisions on selecting an arm from M candidates, yet they only have access to local and potentially biased feedback/evaluation of the true reward for each action taken. Learning only locally will lead agents to sub-optimal actions, while converging to a no-regret strategy requires a collection of distributed data. Motivated by the proposal of federated learning, we aim for a solution with which agents will never share their local observations with a central entity, and will be allowed to share only a private copy of their own information with their neighbors. We first propose a decentralized bandit algorithm Gossip_UCB, which is a coupling of…
