Federated Linear Dueling Bandits

Xuhan Huang; Yan Hu; Zhiyan Li; Zhiyong Wang; Benyou Wang; Zhongxiang Dai

arXiv:2502.01085 · cs.LG · June 4, 2025

TL;DR

This paper introduces FLDB-OGD, a federated algorithm for linear dueling bandits. Because parameter estimation in this setting has no closed-form solution, the algorithm estimates the parameters with online gradient descent (OGD) and combines it with federated learning methods.
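To make the idea of gradient-based federated estimation concrete, here is a minimal sketch. It is not the FLDB-OGD algorithm itself: it assumes a logistic (Bradley-Terry style) preference model, which the summary above does not state, and it runs plain averaged gradient steps on fixed synthetic duels rather than the online arm-selection loop. All function names (`local_gradient`, `federated_gradient_step`, `simulate`) and parameter values are hypothetical. The point it illustrates is that each agent shares only a gradient vector, never its raw duel outcomes.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def local_gradient(theta, diffs, wins):
    """Gradient of one agent's average logistic loss over its own duels.

    diffs: (n, d) array of feature differences x_first - x_second.
    wins:  (n,) array, 1.0 if the first arm was preferred, else 0.0.
    Assumption: preferences follow a logistic link (not stated in the source).
    """
    probs = sigmoid(diffs @ theta)
    return diffs.T @ (probs - wins) / len(wins)


def federated_gradient_step(theta, agent_data, lr):
    """One round: agents send gradients only; the server averages and updates."""
    grads = [local_gradient(theta, diffs, wins) for diffs, wins in agent_data]
    return theta - lr * np.mean(grads, axis=0)


def simulate(num_agents=5, duels_per_agent=200, dim=4, rounds=300, lr=0.5, seed=0):
    """Synthetic check: recover a shared parameter from private per-agent duels."""
    rng = np.random.default_rng(seed)
    theta_true = rng.normal(size=dim)
    theta_true /= np.linalg.norm(theta_true)

    agent_data = []
    for _ in range(num_agents):
        diffs = rng.normal(size=(duels_per_agent, dim))
        wins = (rng.random(duels_per_agent) < sigmoid(diffs @ theta_true)).astype(float)
        agent_data.append((diffs, wins))  # stays local to the agent

    theta = np.zeros(dim)
    for _ in range(rounds):
        theta = federated_gradient_step(theta, agent_data, lr)
    return theta_true, theta


if __name__ == "__main__":
    theta_true, theta_hat = simulate()
    cosine = theta_true @ theta_hat / np.linalg.norm(theta_hat)
    print("cosine similarity to true parameter:", round(float(cosine), 3))
```

Because every agent holds the same number of duels in this sketch, averaging the local gradients equals the gradient of the pooled loss, so the server follows the same update it would with centralized data while seeing only gradient vectors.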

Contribution

The work develops the first federated linear dueling bandit algorithm using OGD, providing theoretical regret bounds and analyzing the trade-offs between regret and communication.

Findings

1. FLDB-OGD achieves sub-linear regret bounds.

2. More agents lead to improved performance.

3. There is a clear trade-off between regret and communication complexity.

Abstract

Contextual linear dueling bandits have recently garnered significant attention due to their widespread applications in important domains such as recommender systems and large language models. Classical dueling bandit algorithms are typically only applicable to a single agent. However, many applications of dueling bandits involve multiple agents who wish to collaborate for improved performance yet are unwilling to share their data. This motivates us to draw inspiration from federated learning, which involves multiple agents aiming to collaboratively train their neural networks via gradient descent (GD) without sharing their raw data. Previous works have developed federated linear bandit algorithms which rely on closed-form updates of the bandit parameters (e.g., the linear function parameters) to achieve collaboration. However, in linear dueling bandits, the linear function parameters…

Taxonomy

Topics: Advanced Bandit Algorithms Research · Privacy-Preserving Technologies in Data · Cryptography and Data Security