Regret Lower Bounds in Multi-agent Multi-armed Bandit

Mengfan Xu; Diego Klabjan

arXiv:2308.08046·cs.LG·August 17, 2023

Open Access

TL;DR

This paper provides the first comprehensive analysis of regret lower bounds in multi-agent multi-armed bandit problems across various settings, establishing tight bounds and bridging gaps with existing upper bounds.

Contribution

It establishes tight regret lower bounds for multi-agent bandits across several settings: stochastic and adversarial rewards, on connected and disconnected graphs.

Findings

01

Lower bound of Ω(log T) for stochastic rewards on connected graphs

02

Lower bound of Ω(√T) for the mean-gap-independent stochastic case

03

Lower bound of Ω(T^{2/3}) for adversarial rewards
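
To give a feel for how far apart these three rates are, the sketch below evaluates log T, √T, and T^{2/3} at a few horizons. It is only an illustration: the function name `growth_rates` and the chosen horizons are hypothetical, and all constants and problem-dependent factors (number of arms, number of agents, mean gaps, graph parameters) are omitted, so the numbers are not regret values from the paper.

```python
import math


def growth_rates(T):
    """Evaluate the three rate functions at horizon T, with constants omitted."""
    return {
        "log T": math.log(T),
        "sqrt T": math.sqrt(T),
        "T^(2/3)": T ** (2.0 / 3.0),
    }


# Hypothetical horizons, chosen only to show the widening gap between rates.
for T in (10**2, 10**4, 10**6):
    rates = growth_rates(T)
    print(T, {name: round(value, 1) for name, value in rates.items()})
```

At every horizon shown, the logarithmic rate is the smallest and the T^{2/3} rate the largest, which is why the adversarial setting is the hardest of the three in terms of the regret any algorithm must incur.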

Abstract

Multi-armed Bandit motivates methods with provable upper bounds on regret, and the counterpart lower bounds have also been extensively studied in this context. Recently, Multi-agent Multi-armed Bandit has gained significant traction in various domains, where individual clients face bandit problems in a distributed manner and the objective is the overall system performance, typically measured by regret. While efficient algorithms with regret upper bounds have emerged, limited attention has been given to the corresponding regret lower bounds, except for a recent lower bound for adversarial settings, which, however, has a gap with the known upper bounds. To this end, we herein provide the first comprehensive study on regret lower bounds across different settings and establish their tightness. Specifically, when the graphs exhibit good connectivity properties and the rewards are…

Taxonomy

Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques · Optimization and Search Problems