Federated Combinatorial Multi-Agent Multi-Armed Bandits
Fares Fourati, Mohamed-Slim Alouini, Vaneet Aggarwal

TL;DR
This paper develops a federated learning framework for multi-agent combinatorial bandits that improves regret bounds, reduces communication, and applies to submodular maximization; the results are validated experimentally.
Contribution
The framework transforms offline approximation algorithms into online multi-agent algorithms with improved regret bounds and communication efficiency, and it applies to submodular maximization.
Findings
Achieves regret that grows sublinearly in the time horizon T.
Ensures linear speedup as the number of agents increases.
Requires only a sublinear number of communication rounds.
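For readers unfamiliar with the term, the regret notion behind these findings can be written out as a short sketch. The form below is the standard α-regret used in combinatorial bandits with approximation oracles; the symbols f (expected reward function), S* (optimal subset), and S_t (subset played at round t) are notation assumed here for illustration and are not taken from this page.

```latex
% Standard alpha-regret over a horizon of T rounds (assumed notation):
% compare the accumulated reward against alpha times the optimum.
\[
  \mathcal{R}_{\alpha}(T)
  \;=\; \alpha \, T \, f(S^{*})
  \;-\; \mathbb{E}\!\left[\sum_{t=1}^{T} f(S_{t})\right]
\]
% "Sublinear growth" means the average regret per round vanishes:
\[
  \lim_{T \to \infty} \frac{\mathcal{R}_{\alpha}(T)}{T} \;=\; 0 .
\]
```

Under this definition, a sublinear bound says that the average per-round reward approaches at least an α fraction of the optimal value as T grows.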
Abstract
This paper introduces a federated learning framework tailored for online combinatorial optimization with bandit feedback. In this setting, agents select subsets of arms, observe noisy rewards for these subsets without accessing individual arm information, and can cooperate and share information at specific intervals. Our framework transforms any offline resilient single-agent $(\alpha-\epsilon)$-approximation algorithm, having a complexity of $\tilde{\mathcal{O}}(\psi/\epsilon^{\beta})$, where the logarithm is omitted, for some function $\psi$ and constant $\beta$, into an online multi-agent algorithm with $m$ communicating agents and an $\alpha$-regret of no more than $\tilde{\mathcal{O}}(m^{-\frac{1}{3+\beta}} \psi^{\frac{1}{3+\beta}} T^{\frac{2+\beta}{3+\beta}})$. This approach not only eliminates the $\epsilon$ approximation error but also ensures sublinear growth with respect to…
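The idea of turning an offline algorithm into an online multi-agent one can be illustrated with a small sketch. This is not the paper's algorithm: it is a toy under stated assumptions, in which a plain greedy routine for cardinality-constrained maximization has each value query answered by averaging noisy subset rewards collected by several agents. The coverage reward, the arm sets, the noise level, and the names `federated_estimate`, `m` (number of agents), and `r` (plays per agent per query) are all hypothetical choices made for illustration.

```python
import random
from statistics import mean

# Hypothetical toy problem: each arm covers some items of a 10-item universe.
ARM_SETS = {0: {0, 1, 2, 3, 4}, 1: {5, 6, 7, 8}, 2: {0, 1, 2}, 3: {9}}
UNIVERSE_SIZE = 10


def coverage(subset):
    """Expected reward of a subset: fraction of the universe it covers."""
    covered = set()
    for arm in subset:
        covered |= ARM_SETS[arm]
    return len(covered) / UNIVERSE_SIZE


def noisy_play(subset, rng, noise=0.1):
    """Bandit feedback: one noisy reward for the whole subset, no per-arm info."""
    return coverage(subset) + rng.gauss(0.0, noise)


def federated_estimate(subset, m, r, rng):
    """Each of m agents plays the subset r times and reports its local mean;
    a single communication round then averages the m local means."""
    local_means = [mean(noisy_play(subset, rng) for _ in range(r)) for _ in range(m)]
    return mean(local_means)


def greedy(k, oracle):
    """Offline greedy selection of k arms using any value oracle."""
    chosen = []
    for _ in range(k):
        candidates = [a for a in ARM_SETS if a not in chosen]
        best = max(candidates, key=lambda a: oracle(chosen + [a]))
        chosen.append(best)
    return chosen


rng = random.Random(0)
# Same greedy routine, but with the exact oracle swapped for federated estimates.
online_choice = greedy(2, lambda s: federated_estimate(s, m=8, r=50, rng=rng))
offline_choice = greedy(2, coverage)
print(online_choice, offline_choice)
```

In this sketch the estimate for each queried subset pools m × r plays, so adding agents shrinks the number of rounds each agent needs for a given accuracy. That is the intuition behind a speedup in the number of agents, although the paper's actual guarantees rest on its own analysis.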
Taxonomy
Topics: Advanced Bandit Algorithms Research · Data Stream Mining Techniques · Auction Theory and Applications
