Federated Bandit: A Gossiping Approach

Zhaowei Zhu; Jingxuan Zhu; Ji Liu; Yang Liu

arXiv:2010.12763·cs.LG·April 8, 2021

TL;DR

This paper introduces a decentralized federated multi-armed bandit algorithm that enables agents to collaboratively learn optimal actions through gossiping, while preserving privacy and achieving sublinear regret.

Contribution

It proposes Gossip_UCB, a novel decentralized bandit algorithm combining gossiping and UCB, and extends it to Fed_UCB, which ensures differential privacy with competitive regret bounds.
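The two ingredients named above, gossiping and UCB, can be illustrated separately. The following is a minimal sketch and not the paper's Gossip_UCB: it assumes uniform pairwise gossip on a hypothetical 4-agent ring, takes the quantity agents want to learn to be the average of their local means, and uses a standard UCB1-style bonus. The names `gossip_step`, `ucb_index`, and `neighbors` are illustrative only.

```python
import numpy as np

def gossip_step(estimates, neighbors, rng):
    """One pairwise gossip round (assumed protocol): a random agent
    averages its per-arm estimates with one random neighbor."""
    i = rng.integers(len(estimates))
    j = rng.choice(neighbors[i])
    avg = 0.5 * (estimates[i] + estimates[j])
    estimates[i] = avg
    estimates[j] = avg
    return estimates

def ucb_index(mean_estimate, pulls, t):
    """UCB1-style index (assumed form): estimate plus exploration bonus."""
    return mean_estimate + np.sqrt(2.0 * np.log(t) / np.maximum(pulls, 1))

rng = np.random.default_rng(0)
N, M = 4, 3                                   # agents, arms
neighbors = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}  # hypothetical ring

# Each agent holds a different (biased) local view of every arm's mean.
local_means = rng.uniform(0.0, 1.0, size=(N, M))
global_means = local_means.mean(axis=0)       # assumed target: network average

# Gossip alone drives every agent's estimate toward the network average.
estimates = local_means.copy()
for _ in range(200):
    gossip_step(estimates, neighbors, rng)

# UCB alone favors the less-explored arm when estimates are tied.
index = ucb_index(np.array([0.5, 0.5]), np.array([10, 1]), t=11)
```

Pairwise averaging never changes the network-wide sum of estimates, which is why every agent ends up near `global_means` without any agent sending raw observations to a central node.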

Findings

01

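Gossip_UCB achieves regret of order $O(\max\{\mathrm{poly}(N,M)\log T,\ \mathrm{poly}(N,M)\log_{\lambda_2^{-1}} N\})$ for all $N$ agents, where $\lambda_2 \in (0,1)$ is the second-largest eigenvalue of the expected gossip matrix, which depends on the graph $G$.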

02

Fed_UCB preserves $\epsilon$-differential privacy of each agent's local data while achieving regret of order $O(\max\{\tfrac{\mathrm{poly}(N,M)}{\epsilon}\log^{2.5} T,\ \mathrm{poly}(N,M)(\log_{\lambda_2^{-1}} N + \log T)\})$.

03

Together, the two algorithms enable decentralized multi-agent bandit learning with provable regret guarantees; Fed_UCB additionally protects agents' local data with differential privacy.
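The graph-dependent term $\log_{\lambda_2^{-1}} N$ in the first two findings can be evaluated numerically for a given network. The sketch below assumes that $\lambda_2$ is the second-largest eigenvalue of the expected gossip matrix and that gossip is uniform pairwise averaging (a random agent wakes up and averages with a random neighbor); the paper's exact gossip model is not stated in this summary, so `expected_gossip_matrix` and the ring topology are hypothetical.

```python
import numpy as np

def expected_gossip_matrix(neighbors):
    """Expected averaging matrix under assumed uniform pairwise gossip."""
    n = len(neighbors)
    W = np.zeros((n, n))
    for i, nbrs in neighbors.items():
        for j in nbrs:
            p = 1.0 / (n * len(nbrs))   # P(agent i wakes and picks neighbor j)
            W_ij = np.eye(n)            # everyone else keeps their value
            W_ij[i, i] = W_ij[j, j] = 0.5
            W_ij[i, j] = W_ij[j, i] = 0.5
            W += p * W_ij
    return W

def second_largest_eigenvalue(W):
    """W is symmetric here, so eigvalsh returns real eigenvalues."""
    return np.sort(np.linalg.eigvalsh(W))[-2]

ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}  # hypothetical 4-agent ring
N = len(ring)

W = expected_gossip_matrix(ring)
lam2 = second_largest_eigenvalue(W)

# log base (1 / lambda_2) of N, the graph-dependent term in the bounds.
mixing_term = np.log(N) / np.log(1.0 / lam2)
print(f"lambda_2 = {lam2:.3f}, log_(1/lambda_2) N = {mixing_term:.3f}")
```

A better-connected graph pushes $\lambda_2$ further below 1, which shrinks this term; a sparse graph with $\lambda_2$ close to 1 makes it large.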

Abstract

In this paper, we study Federated Bandit, a decentralized Multi-Armed Bandit problem with a set of $N$ agents, who can only communicate their local data with neighbors described by a connected graph $G$. Each agent makes a sequence of decisions on selecting an arm from $M$ candidates, yet they only have access to local and potentially biased feedback/evaluation of the true reward for each action taken. Learning only locally will lead agents to sub-optimal actions, while converging to a no-regret strategy requires a collection of distributed data. Motivated by the proposal of federated learning, we aim for a solution with which agents will never share their local observations with a central entity, and will be allowed to share only a private copy of their own information with their neighbors. We first propose a decentralized bandit algorithm Gossip_UCB, which is a coupling of…
