Collaborative Min-Max Regret in Grouped Multi-Armed Bandits
Moïse Blanchard, Vineet Goyal

TL;DR
This paper introduces Col-UCB, an algorithm for shared exploration in grouped multi-armed bandits. It minimizes the collaborative regret, defined as the maximum regret across groups, and so balances exploration costs across groups with overlapping action sets.
Contribution
The paper proposes Col-UCB, an algorithm that dynamically coordinates exploration across groups in grouped bandits. It achieves collaborative regret bounds that are optimal up to logarithmic factors and gives insight into when collaboration helps.
Findings
Col-UCB achieves optimal minimax and instance-dependent collaborative regret, up to logarithmic factors.
The benefit of collaboration depends on the structure of the action sets shared between groups.
The regret bounds adapt to how the groups' action sets overlap.
Abstract
We study the impact of sharing exploration in multi-armed bandits in a grouped setting where a set of groups have overlapping feasible action sets [Baek and Farias '24]. In this grouped bandit setting, groups share reward observations, and the objective is to minimize the collaborative regret, defined as the maximum regret across groups. This naturally captures applications in which one aims to balance the exploration burden between groups or populations -- it is known that standard algorithms can lead to significantly imbalanced exploration cost between groups. We address this problem by introducing an algorithm Col-UCB that dynamically coordinates exploration across groups. We show that Col-UCB achieves both optimal minimax and instance-dependent collaborative regret up to logarithmic factors. These bounds are adaptive to the structure of shared action sets between groups, providing…
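To make the objective concrete, here is a minimal Python sketch of the grouped setting with shared observations. It is not Col-UCB: each group simply runs a standard UCB1 index on pooled statistics, which is the kind of baseline the abstract says can produce imbalanced exploration cost. The function names, the Bernoulli rewards, the one-pull-per-group-per-round protocol, and the example instance are all assumptions made for illustration.

```python
import math
import random


def collaborative_regret(per_group_regret):
    """Collaborative regret: the maximum regret across groups."""
    return max(per_group_regret.values())


def run_shared_ucb(means, groups, horizon, seed=0):
    """Naive baseline (NOT Col-UCB): every group plays UCB1 on its own
    feasible arms, using pull counts and reward sums pooled over all groups.

    means  : Bernoulli mean of each arm (hypothetical instance)
    groups : dict mapping group name -> list of feasible arm indices
    Returns the pseudo-regret accumulated by each group.
    """
    rng = random.Random(seed)
    counts = [0] * len(means)   # shared pull counts
    sums = [0.0] * len(means)   # shared reward sums
    regret = {g: 0.0 for g in groups}
    t = 0
    for _ in range(horizon):
        # Assumption: each group pulls exactly one arm per round.
        for g, arms in groups.items():
            t += 1
            untried = [a for a in arms if counts[a] == 0]
            if untried:
                arm = untried[0]
            else:
                arm = max(
                    arms,
                    key=lambda a: sums[a] / counts[a]
                    + math.sqrt(2.0 * math.log(t) / counts[a]),
                )
            reward = 1.0 if rng.random() < means[arm] else 0.0
            counts[arm] += 1
            sums[arm] += reward
            # Regret is measured against the best arm feasible for this group.
            regret[g] += max(means[a] for a in arms) - means[arm]
    return regret


# Hypothetical instance: arm 1 is shared by both groups.
means = [0.3, 0.5, 0.7]
groups = {"A": [0, 1], "B": [1, 2]}
per_group = run_shared_ucb(means, groups, horizon=2000)
print(per_group, collaborative_regret(per_group))
```

Comparing the per-group values with their maximum shows how unevenly the exploration burden can fall on individual groups under such a baseline.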
Taxonomy
Topics: Advanced Bandit Algorithms Research · Mobile Crowdsensing and Crowdsourcing · Recommender Systems and Techniques
Methods: Sparse Evolutionary Training
