Decentralized Online Bandit Optimization on Directed Graphs with Regret Bounds

Johan Östman; Ather Gattami; Daniel Gillblad

arXiv:2301.11802 · cs.LG · January 30, 2023

TL;DR

This paper introduces a decentralized online bandit optimization algorithm for multiplayer games on directed acyclic graphs, achieving sub-linear regret and analyzing the cost of decentralization.

Contribution

The paper proposes a learning algorithm for decentralized bandit games on directed acyclic graphs, proves regret bounds for it, and analyzes the cost of decentralization.

Findings

1. Achieves sub-linear joint pseudo-regret in both adversarial and stochastic settings.

2. Quantifies the additional cost due to decentralization compared to centralized algorithms.

3. Provides a framework for decentralized learning in hierarchical multiplayer bandit problems.

Abstract

We consider a decentralized multiplayer game, played over $T$ rounds, with a leader-follower hierarchy described by a directed acyclic graph. For each round, the graph structure dictates the order of the players and how players observe the actions of one another. By the end of each round, all players receive a joint bandit-reward based on their joint action that is used to update the player strategies towards the goal of minimizing the joint pseudo-regret. We present a learning algorithm inspired by the single-player multi-armed bandit problem and show that it achieves sub-linear joint pseudo-regret in the number of rounds for both adversarial and stochastic bandit rewards. Furthermore, we quantify the cost incurred due to the decentralized nature of our problem compared to the centralized setting.
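
The setting described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's algorithm: it assumes the simplest possible graph (one leader, one follower that observes the leader's action), assumes EXP3 as the single-player bandit learner, and assumes the follower keeps one learner per observed leader action. The names `Exp3`, `play_chain`, and `means` are hypothetical and chosen only for this example.

```python
import math
import random


class Exp3:
    """Single-player EXP3 for rewards in [0, 1] (loss-based estimator)."""

    def __init__(self, n_arms, eta, rng):
        self.n_arms = n_arms
        self.eta = eta
        self.rng = rng
        # Cumulative importance-weighted loss estimate per arm.
        self.cum_loss = [0.0] * n_arms

    def probs(self):
        # Exponential weights; subtract the minimum for numerical stability.
        low = min(self.cum_loss)
        weights = [math.exp(-self.eta * (l - low)) for l in self.cum_loss]
        total = sum(weights)
        return [w / total for w in weights]

    def act(self):
        p = self.probs()
        arm = self.rng.choices(range(self.n_arms), weights=p)[0]
        return arm, p[arm]

    def update(self, arm, prob, reward):
        # Only the played arm is observed (bandit feedback).
        self.cum_loss[arm] += (1.0 - reward) / prob


def play_chain(means, horizon, seed=0):
    """Leader -> follower game with a shared Bernoulli reward.

    means[a][b] is the (assumed) mean joint reward when the leader plays a
    and the follower plays b. Returns the total reward and the learners.
    """
    rng = random.Random(seed)
    k_leader, k_follower = len(means), len(means[0])
    eta_l = math.sqrt(2 * math.log(k_leader) / (horizon * k_leader))
    eta_f = math.sqrt(2 * math.log(k_follower) / (horizon * k_follower))
    leader = Exp3(k_leader, eta_l, rng)
    # Assumption: the follower runs one independent learner per leader action.
    followers = [Exp3(k_follower, eta_f, rng) for _ in range(k_leader)]
    total = 0.0
    for _ in range(horizon):
        a, pa = leader.act()
        b, pb = followers[a].act()  # follower conditions on the observed action
        reward = 1.0 if rng.random() < means[a][b] else 0.0
        # Both players update from the same joint bandit reward.
        leader.update(a, pa, reward)
        followers[a].update(b, pb, reward)
        total += reward
    return total, leader, followers


if __name__ == "__main__":
    total, _, _ = play_chain([[0.2, 0.3], [0.1, 0.9]], horizon=5000, seed=1)
    print("average joint reward:", total / 5000)
```

Neither player sees the reward table; each sees only its own action, the actions the graph lets it observe, and the joint reward. In this toy instance the best joint action is (1, 1), which the leader can only discover once the follower's learner for leader action 1 has started favoring arm 1. That coupling is one intuition for why decentralization may cost more than a centralized learner over the joint action space.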

Taxonomy

Topics: Advanced Bandit Algorithms Research · Game Theory and Applications · Auction Theory and Applications