Heterogeneous Multi-agent Multi-armed Bandits on Stochastic Block Models
Mengfan Xu, Liren Shan, Fatemeh Ghaffari, Xuchuang Wang, Xutong Liu, Mohammad Hajiesmaili

TL;DR
This paper introduces a new multi-agent multi-armed bandit framework on stochastic block models, addressing reward heterogeneity and graph randomness, with a novel algorithm achieving near-optimal regret bounds.
Contribution
The paper proposes an algorithm for heterogeneous multi-agent bandits on stochastic block models that handles both known and unknown cluster structures, with improved regret bounds.
Findings
Achieves logarithmic regret bounds under sub-Gaussian rewards.
Handles both known and unknown cluster structures effectively.
Provides scalable solutions with relaxed assumptions compared to prior work.
Abstract
We study a novel heterogeneous multi-agent multi-armed bandit problem with a cluster structure induced by stochastic block models, which influences not only the graph topology but also the reward heterogeneity. Specifically, agents are distributed on random graphs based on stochastic block models, a generalized Erdős-Rényi model with heterogeneous edge probabilities: agents are grouped into clusters (known or unknown), and edge probabilities for agents within the same cluster differ from those across clusters. In addition, the cluster structure in the stochastic block model also determines our heterogeneous rewards. Reward distributions of the same arm vary across agents in different clusters but remain consistent within a cluster, unifying homogeneous and heterogeneous settings and varying degrees of heterogeneity, and rewards are independent samples from these distributions. The objective is to minimize…
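The setup described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's algorithm; it only samples the environment: a stochastic block model graph and cluster-dependent rewards. All names (`sample_sbm`, `sample_reward`, `p_in`, `q_out`, `means`, `sigma`) are hypothetical, the use of a single within-cluster and a single cross-cluster edge probability is a simplifying assumption, and Gaussian rewards are chosen only as one convenient instance of a sub-Gaussian distribution.

```python
import random


def sample_sbm(clusters, p_in, q_out, rng):
    """Sample an undirected graph from a two-parameter stochastic block model.

    clusters[i] is the cluster label of agent i. An edge between two agents
    appears with probability p_in if they share a cluster, q_out otherwise.
    (The two-parameter form is an assumption made for illustration.)
    """
    n = len(clusters)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            p = p_in if clusters[i] == clusters[j] else q_out
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def sample_reward(agent, arm, clusters, means, sigma, rng):
    """Draw one reward for (agent, arm).

    The mean depends only on the agent's cluster, so agents in the same
    cluster share a reward distribution while clusters may differ.
    Gaussian noise is an assumed example of a sub-Gaussian distribution.
    """
    return rng.gauss(means[clusters[agent]][arm], sigma)


rng = random.Random(0)
clusters = [0, 0, 0, 1, 1, 1]          # six agents, two clusters
means = [[0.9, 0.1], [0.2, 0.8]]       # means[cluster][arm], made-up values
graph = sample_sbm(clusters, p_in=0.8, q_out=0.1, rng=rng)
reward = sample_reward(agent=4, arm=1, clusters=clusters,
                       means=means, sigma=0.1, rng=rng)
```

Placing every agent in one cluster recovers the homogeneous setting (all agents share the same reward means, and the graph reduces to an Erdős-Rényi graph), while giving each agent its own cluster makes every agent's rewards distinct.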
Taxonomy
Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques · Game Theory and Applications
