Combining Diverse Information for Coordinated Action: Stochastic Bandit Algorithms for Heterogeneous Agents
Lucia Gordon, Esther Rolf, Milind Tambe

TL;DR
This paper introduces Min-Width, a UCB-style algorithm for multi-agent stochastic bandits with heterogeneous rewards. It aggregates information from agents with different sensitivities and coordinates their actions.
Contribution
The paper proposes Min-Width, an algorithm that aggregates rewards from heterogeneous agents and coordinates their arm assignments in stochastic bandit problems. It targets a gap in prior work, which does not specify how to allocate agents of heterogeneous but known sensitivity.
Findings
Modeling agent heterogeneity improves performance when sensitivities vary widely.
More information sharing does not always lead to better results.
Min-Width achieves favorable regret in synthetic experiments.
Abstract
Stochastic multi-agent multi-armed bandits typically assume that the rewards from each arm follow a fixed distribution, regardless of which agent pulls the arm. However, in many real-world settings, rewards can depend on the sensitivity of each agent to their environment. In medical screening, disease detection rates can vary by test type; in preference matching, rewards can depend on user preferences; and in environmental sensing, observation quality can vary across sensors. Since past work does not specify how to allocate agents of heterogeneous but known sensitivity of these types in a stochastic bandit setting, we introduce a UCB-style algorithm, Min-Width, which aggregates information from diverse agents. In doing so, we address the joint challenges of (i) aggregating the rewards, which follow different distributions for each agent-arm pair, and (ii) coordinating the assignments of…
Taxonomy
Topics: Advanced Bandit Algorithms Research · Data Stream Mining Techniques · Auction Theory and Applications
