Meet Me at the Arm: The Cooperative Multi-Armed Bandits Problem with Shareable Arms
Xinyi Hu, Aldo Pacchiano

TL;DR
This paper introduces a decentralized algorithm for multi-player multi-armed bandits with unknown arm capacities and no collision sensing, achieving logarithmic regret.
Contribution
The paper presents A-CAPELLA, a decentralized algorithm for capacity-aware learning and coordination when each player observes only their own reward.
Findings
Achieves logarithmic regret in the generalized capacity-aware setting.
Handles severe feedback limitations without collision sensing.
Provides a protocol-driven coordination mechanism.
Abstract
We study the decentralized multi-player multi-armed bandits (MMAB) problem under a no-sensing setting, where each player receives only their own reward and obtains no information about collisions. Each arm has an unknown capacity, and if the number of players pulling an arm exceeds its capacity, all players involved receive zero reward. This setting generalizes the classical unit-capacity model and introduces new challenges in coordination and capacity discovery under severe feedback limitations. We propose A-CAPELLA (Algorithm for Capacity-Aware Parallel Elimination for Learning and Allocation), a decentralized learning algorithm that achieves logarithmic regret in this generalized regime via protocol-driven coordination.
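The feedback model described in the abstract can be illustrated with a small simulation of a single round. This is only a sketch of the environment, not of A-CAPELLA itself. The function name `play_round`, the Bernoulli reward distribution, and the independent per-player draws are assumptions made for illustration; the abstract does not specify them.

```python
import random
from collections import Counter


def play_round(choices, means, capacities, rng):
    """Simulate one round of the shareable-arm, no-sensing setting.

    choices[i] is the arm pulled by player i.
    means[k] and capacities[k] describe arm k and are unknown to the players.
    Returns one reward per player; that reward is all a player observes.
    """
    load = Counter(choices)  # number of players on each arm this round
    rewards = []
    for arm in choices:
        if load[arm] > capacities[arm]:
            # Over capacity: every player on this arm receives zero.
            rewards.append(0)
        else:
            # Assumption: independent Bernoulli draw per player.
            rewards.append(1 if rng.random() < means[arm] else 0)
    return rewards


rng = random.Random(0)
# Two players share arm 0 (capacity 2); one player takes arm 1 (capacity 1).
within_capacity = play_round([0, 0, 1], [1.0, 1.0], [2, 1], rng)
# Three players crowd arm 0, exceeding its capacity of 2.
over_capacity = play_round([0, 0, 0], [1.0, 1.0], [2, 1], rng)
```

With stochastic rewards, a zero can come either from an overloaded arm or from an unlucky draw, and the player receives no signal that separates the two cases. This is what makes capacity discovery difficult in the no-sensing setting.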
