Adaptive Sample Sharing for Multi Agent Linear Bandits
Hamza Cherkaoui, Merwan Barlier, Igor Colin

TL;DR
This paper introduces BASS (Bandit Adaptive Sample Sharing), an algorithm for multi-agent linear bandits that balances the bias and the uncertainty of parameter estimates when agents share samples. It makes no assumption about the structure of the bandit parameters and is supported by both theoretical analysis and experiments.
Contribution
The paper proposes an adaptive sample sharing method that does not rely on assumptions about the structure of the bandit parameters, improving collaboration between agents in multi-agent linear bandits.
Findings
BASS improves on state-of-the-art methods in regret minimization on both synthetic and real-world datasets.
When agents' parameters have a cluster structure, the algorithm accurately recovers the clusters.
Theoretical analysis confirms the efficiency of BASS.
Abstract
The multi-agent linear bandit setting is a well-known setting for which designing efficient collaboration between agents remains challenging. This paper studies the impact of data sharing among agents on regret minimization. Unlike most existing approaches, our contribution does not rely on any assumption about the structure of the bandit parameters. Our main result formalizes the trade-off between the bias and the uncertainty of the bandit parameter estimation for efficient collaboration. This result is the cornerstone of the Bandit Adaptive Sample Sharing (BASS) algorithm, whose efficiency over the current state of the art is validated through theoretical analysis and empirical evaluations on both synthetic and real-world datasets. Furthermore, we demonstrate that, when agents' parameters display a cluster structure, our algorithm accurately recovers the clusters.
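The bias/uncertainty trade-off mentioned in the abstract can be illustrated with a small simulation. The sketch below is not the BASS algorithm and does not reproduce the paper's sharing criterion; it only shows, under assumed Gaussian features and noise, that pooling another agent's samples into a ridge regression estimate helps when the two agents share the same parameter (less uncertainty) and hurts when their parameters differ (more bias). The names `ridge_estimate`, `estimation_errors`, and `mean_errors`, and all numeric settings, are hypothetical choices made for illustration.

```python
import numpy as np


def ridge_estimate(X, y, lam=1.0):
    """Ridge regression estimate of a linear parameter from (X, y)."""
    d = X.shape[1]
    A = X.T @ X + lam * np.eye(d)
    return np.linalg.solve(A, X.T @ y)


def estimation_errors(theta_own, theta_other, n=20, noise=0.5, seed=0):
    """Return (solo_error, pooled_error) for one simulated pair of agents.

    solo_error uses only the agent's own n samples; pooled_error also
    uses n samples generated by the other agent's parameter.
    """
    rng = np.random.default_rng(seed)
    d = theta_own.shape[0]
    X_own = rng.normal(size=(n, d))
    X_other = rng.normal(size=(n, d))
    y_own = X_own @ theta_own + noise * rng.normal(size=n)
    y_other = X_other @ theta_other + noise * rng.normal(size=n)

    solo = ridge_estimate(X_own, y_own)
    pooled = ridge_estimate(
        np.vstack([X_own, X_other]), np.concatenate([y_own, y_other])
    )
    return (
        np.linalg.norm(solo - theta_own),
        np.linalg.norm(pooled - theta_own),
    )


def mean_errors(theta_own, theta_other, trials=200):
    """Average the solo and pooled errors over several random seeds."""
    errs = np.array(
        [estimation_errors(theta_own, theta_other, seed=s) for s in range(trials)]
    )
    return errs[:, 0].mean(), errs[:, 1].mean()


if __name__ == "__main__":
    theta = np.ones(5)
    # Same parameter: sharing adds samples without adding bias.
    print("identical agents (solo, pooled):", mean_errors(theta, theta))
    # Opposite parameter: sharing adds samples but biases the estimate.
    print("different agents (solo, pooled):", mean_errors(theta, -theta))
```

In this toy setting the pooled estimate is better on average for identical agents and worse for agents with opposite parameters, which is the tension an adaptive sharing rule has to resolve without knowing the parameters in advance.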
Taxonomy
Topics: Advanced Bandit Algorithms Research · Recommender Systems and Techniques · Image and Video Quality Assessment
