Collaborating in Multi-Armed Bandits with Strategic Agents
Idan Barnea, Ofir Schlisselberg, Yishay Mansour

TL;DR
This paper introduces CAOS, a mechanism enabling strategic agents in multi-armed bandit problems to collaborate through information sharing, achieving near-cooperative performance without monetary incentives.
Contribution
The paper proposes CAOS, a mechanism that sustains collaboration among strategic agents in multi-armed bandits as a Nash equilibrium while retaining strong regret guarantees.
Findings
CAOS maintains collaboration as a Nash equilibrium.
The mechanism achieves regret close to fully cooperative systems.
Pure information sharing can incentivize strategic agents effectively.
Abstract
We study collaborative learning in multi-agent Bayesian bandit problems, where strategic agents collectively solve the same bandit instance. While multiple agents can accelerate learning by sharing information, strategic agents might prefer to free-ride and avoid exploration. We consider a setting with persistent agents that participate in multiple time periods. This is in contrast to most previous works on incentives in multi-agent MAB, which assume short-lived agents, namely each agent has a single decision to make and optimizes their expected reward in that single decision. As in the multi-agent MAB model with incentives, our model does not have monetary transfers, and the only incentives are through information sharing. We propose CAOS, a mechanism that sustains collaboration as a Nash equilibrium while achieving strong regret guarantees. Our results demonstrate that…
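To make the setting in the abstract concrete, the following is a minimal sketch of several agents facing the same Bernoulli bandit, either pooling their observations or learning in isolation. It is not CAOS and models no strategic behavior: every agent here simply runs Thompson sampling with a Beta(1, 1) prior, which is a stand-in chosen for illustration. The function name `simulate_regret`, its parameters, and the arm means are all hypothetical.

```python
import random


def simulate_regret(means, n_agents, horizon, share, seed=0):
    """Total pseudo-regret of n_agents running Thompson sampling on one Bernoulli bandit.

    Illustrative assumption: agents are non-strategic and always explore honestly.
    share=True  -> all agents read and update one pooled posterior.
    share=False -> each agent only sees its own observations.
    """
    rng = random.Random(seed)
    k = len(means)
    best = max(means)

    if share:
        pooled = [[0, 0] for _ in range(k)]  # [successes, failures] per arm
        posteriors = [pooled] * n_agents     # every agent references the same counts
    else:
        posteriors = [[[0, 0] for _ in range(k)] for _ in range(n_agents)]

    regret = 0.0
    for _ in range(horizon):
        # All agents pick an arm before any update, so they act simultaneously in a round.
        choices = []
        for a in range(n_agents):
            samples = [rng.betavariate(1 + s, 1 + f) for s, f in posteriors[a]]
            choices.append(max(range(k), key=samples.__getitem__))

        for a, arm in enumerate(choices):
            reward = rng.random() < means[arm]
            posteriors[a][arm][0 if reward else 1] += 1
            regret += best - means[arm]  # pseudo-regret uses true means, so it is >= 0
    return regret


if __name__ == "__main__":
    arm_means = [0.3, 0.5, 0.7]  # hypothetical instance
    print("pooled  :", simulate_regret(arm_means, n_agents=4, horizon=500, share=True))
    print("isolated:", simulate_regret(arm_means, n_agents=4, horizon=500, share=False))
```

Comparing the two printed totals across several seeds gives a rough sense of how much information sharing can accelerate learning. The free-riding problem the paper addresses arises when an agent would rather exploit the pooled data than contribute exploration of its own, which this sketch deliberately leaves out.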
