Federated Linear Dueling Bandits
Xuhan Huang, Yan Hu, Zhiyan Li, Zhiyong Wang, Benyou Wang, Zhongxiang Dai

TL;DR
This paper introduces FLDB-OGD, a federated algorithm for linear dueling bandits that estimates the bandit parameters with online gradient descent (OGD), because parameter estimation in this setting has no closed-form solution.
Contribution
The work develops the first federated linear dueling bandit algorithm using OGD, providing theoretical regret bounds and analyzing the trade-offs between regret and communication.
Findings
FLDB-OGD achieves sub-linear regret bounds.
More agents lead to improved performance.
There is a clear trade-off between regret and communication complexity.
Abstract
Contextual linear dueling bandits have recently garnered significant attention due to their widespread applications in important domains such as recommender systems and large language models. Classical dueling bandit algorithms are typically only applicable to a single agent. However, many applications of dueling bandits involve multiple agents who wish to collaborate for improved performance yet are unwilling to share their data. This motivates us to draw inspiration from federated learning, which involves multiple agents aiming to collaboratively train their neural networks via gradient descent (GD) without sharing their raw data. Previous works have developed federated linear bandit algorithms which rely on closed-form updates of the bandit parameters (e.g., the linear function parameters) to achieve collaboration. However, in linear dueling bandits, the linear function parameters…
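The gradient-sharing idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's FLDB-OGD algorithm: it omits arm selection, exploration, and any communication schedule, and it runs plain gradient descent on fixed synthetic data. It assumes a logistic (Bradley-Terry style) preference model, which is a common choice for dueling feedback but is not stated in the text above. All names (`local_gradient`, `federated_gd`, `make_duels`) and all constants are hypothetical. Each agent computes a gradient on its own duels, and only those gradients are averaged, so raw preference data never leaves the agent.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def local_gradient(theta, duels):
    """Average logistic-loss gradient over one agent's private duels.

    Each duel is (diff, won): diff is the feature vector of the first arm
    minus that of the second, and won is 1.0 if the first arm was preferred.
    """
    grad = [0.0] * len(theta)
    for diff, won in duels:
        err = sigmoid(dot(theta, diff)) - won
        for j, x in enumerate(diff):
            grad[j] += err * x
    return [g / len(duels) for g in grad]


def average_loss(theta, duels):
    # Mean negative log-likelihood: log(1 + exp(z)) - won * z, computed stably.
    total = 0.0
    for diff, won in duels:
        z = dot(theta, diff)
        total += max(z, 0.0) + math.log1p(math.exp(-abs(z))) - won * z
    return total / len(duels)


def federated_gd(agents, dim, rounds=200, lr=0.5):
    """Server loop: average the agents' gradients and take one step per round."""
    theta = [0.0] * dim
    for _ in range(rounds):
        # Agents send gradients only; their duel data stays local.
        grads = [local_gradient(theta, duels) for duels in agents]
        avg = [sum(g[j] for g in grads) / len(grads) for j in range(dim)]
        theta = [t - lr * a for t, a in zip(theta, avg)]
    return theta


def make_duels(rng, theta_true, n):
    # Synthetic duels drawn from the assumed logistic preference model.
    duels = []
    for _ in range(n):
        diff = [rng.gauss(0.0, 1.0) for _ in theta_true]
        won = 1.0 if rng.random() < sigmoid(dot(theta_true, diff)) else 0.0
        duels.append((diff, won))
    return duels


rng = random.Random(0)
theta_true = [1.0, -0.5, 0.25]  # hypothetical ground-truth parameters
agents = [make_duels(rng, theta_true, 500) for _ in range(4)]
theta_hat = federated_gd(agents, dim=3)
print("estimate:", [round(t, 3) for t in theta_hat])
```

Because the logistic loss has no closed-form minimizer, the estimate is reached through repeated gradient steps rather than a single matrix solve, which is the obstacle the abstract points to when contrasting dueling bandits with standard federated linear bandits.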
Taxonomy
Topics: Advanced Bandit Algorithms Research · Privacy-Preserving Technologies in Data · Cryptography and Data Security
