Multi-agent Multi-armed Bandit with Fully Heavy-tailed Dynamics

Xingyu Wang; Mengfan Xu

arXiv:2501.19239 · cs.LG · February 3, 2025

Open Access

TL;DR

This paper introduces algorithms for decentralized multi-agent multi-armed bandits in fully heavy-tailed environments, where clients observe heavy-tailed rewards and communicate over sparse random graphs with heavy-tailed degree distributions, and it establishes near-optimal regret bounds.

Contribution

First to analyze multi-agent bandits in which both the reward distributions and the degree distributions of the sparse communication graphs are heavy-tailed, providing new algorithms with provable regret bounds.

Findings

01

Achieved regret bounds of $O(M^{1 - 1/\alpha} \log T)$ for homogeneous rewards, where $M$ is the number of clients and $\alpha$ is the power-law index of the degree distribution.

02

Established $O(M \log T)$ regret bounds for heterogeneous reward settings.

03

Utilized hub-like structures and new information delay bounds to improve performance.
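
To get a feel for how the two rates in Findings 01 and 02 differ, the short Python sketch below compares only their dependence on the number of clients $M$. It is an illustration rather than anything taken from the paper: the function name `regret_scaling` is hypothetical, constants and the shared $\log T$ factor are dropped, and it assumes $\alpha > 1$ so that the exponent $1 - 1/\alpha$ lies strictly between 0 and 1.

```python
from typing import Tuple


def regret_scaling(num_clients: int, alpha: float) -> Tuple[float, float]:
    """Return the M-dependent factors of the two stated bounds.

    Hypothetical helper for illustration only: constants and the
    common log T factor are ignored, and alpha > 1 is an assumption.
    """
    if num_clients < 1:
        raise ValueError("num_clients must be at least 1")
    if alpha <= 1:
        raise ValueError("this sketch assumes alpha > 1")
    homogeneous = num_clients ** (1.0 - 1.0 / alpha)  # M^(1 - 1/alpha)
    heterogeneous = float(num_clients)                # M
    return homogeneous, heterogeneous


if __name__ == "__main__":
    for m in (10, 100, 1000):
        hom, het = regret_scaling(m, alpha=2.0)
        print(f"M={m}: homogeneous factor {hom:.2f}, heterogeneous factor {het:.2f}")
```

Under these assumptions the homogeneous factor grows sublinearly in $M$, and the gap to the linear heterogeneous factor widens as $\alpha$ approaches 1.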

Abstract

We study decentralized multi-agent multi-armed bandits in fully heavy-tailed settings, where clients communicate over sparse random graphs with heavy-tailed degree distributions and observe heavy-tailed (homogeneous or heterogeneous) reward distributions with potentially infinite variance. The objective is to maximize system performance by pulling the globally optimal arm with the highest global reward mean across all clients. We are the first to address such fully heavy-tailed scenarios, which capture the dynamics and challenges in communication and inference among multiple clients in real-world systems. In homogeneous settings, our algorithmic framework exploits hub-like structures unique to heavy-tailed graphs, allowing clients to aggregate rewards and reduce noises via hub estimators when constructing UCB indices; under $M$ clients and degree distributions with power-law index…

Taxonomy

Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics