Multiplayer Bandit Learning, from Competition to Cooperation

Simina Brânzei; Yuval Peres

arXiv:1908.01135 · cs.GT · January 15, 2024 · 6 cites

TL;DR

This paper studies how competition and cooperation affect exploration in two-player multi-armed bandit problems. Competing players explore less than a single player would, while cooperating players explore more.

Contribution

It introduces a two-player bandit model with a cooperation parameter λ ∈ [-1, 1] and analyzes how the level of cooperation changes the players' exploration behavior and outcomes.

Findings

01 Competing players explore less than a single player.

02 Cooperating players explore more than a single player.

03 Neutral players learn from each other and achieve higher total rewards than they would playing alone.

Abstract

The stochastic multi-armed bandit model captures the tradeoff between exploration and exploitation. We study the effects of competition and cooperation on this tradeoff. Suppose there are $k$ arms and two players, Alice and Bob. In every round, each player pulls an arm, receives the resulting reward, and observes the choice of the other player but not their reward. Alice's utility is $\Gamma_{A} + \lambda \Gamma_{B}$ (and similarly for Bob), where $\Gamma_{A}$ is Alice's total reward and $\lambda \in [-1, 1]$ is a cooperation parameter. At $\lambda = -1$ the players are competing in a zero-sum game, at $\lambda = 1$ they are fully cooperating, and at $\lambda = 0$ they are neutral: each player's utility is their own reward. The model is related to the economics literature on strategic experimentation, where usually players observe each other's rewards. With discount factor $\beta$, the…
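The setup described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' method: it assumes Bernoulli arms, independent reward draws for the two players, and a total reward computed as a sum discounted by `beta ** t`. The names `simulate`, `utilities`, `always`, and `copy_other`, as well as the arm means used in the demo, are hypothetical and chosen only for illustration.

```python
import random


def simulate(arm_means, policy_a, policy_b, beta, rounds, seed=0):
    """Run a two-player Bernoulli bandit and return discounted totals.

    Assumptions (not taken from the paper): rewards are 0/1, the two
    players draw independently, and round t is weighted by beta ** t.
    """
    rng = random.Random(seed)
    # Each history entry is (own arm, own reward, other player's arm).
    hist_a, hist_b = [], []
    gamma_a = gamma_b = 0.0
    for t in range(rounds):
        arm_a = policy_a(hist_a)
        arm_b = policy_b(hist_b)
        reward_a = 1 if rng.random() < arm_means[arm_a] else 0
        reward_b = 1 if rng.random() < arm_means[arm_b] else 0
        # A player sees the other's choice but never the other's reward.
        hist_a.append((arm_a, reward_a, arm_b))
        hist_b.append((arm_b, reward_b, arm_a))
        gamma_a += beta ** t * reward_a
        gamma_b += beta ** t * reward_b
    return gamma_a, gamma_b


def utilities(gamma_a, gamma_b, lam):
    """Return (Alice's utility, Bob's utility) for cooperation level lam."""
    if not -1.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [-1, 1]")
    return gamma_a + lam * gamma_b, gamma_b + lam * gamma_a


def always(arm):
    """Hypothetical policy: pull the same arm in every round."""
    return lambda history: arm


def copy_other(default_arm):
    """Hypothetical policy: repeat the arm the other player pulled last."""
    return lambda history: history[-1][2] if history else default_arm


if __name__ == "__main__":
    # Illustrative arm means; any values in [0, 1] work.
    g_a, g_b = simulate([0.5, 0.7], always(1), copy_other(0),
                        beta=0.9, rounds=50)
    for lam in (-1.0, 0.0, 1.0):
        print(lam, utilities(g_a, g_b, lam))
```

At `lam = -1.0` the two utilities always sum to zero, which is the zero-sum case; at `lam = 1.0` both players receive the same utility, the sum of the two totals; at `lam = 0.0` each utility is just that player's own total reward.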

Videos

Multiplayer Bandit Learning - From Competition to Cooperation · YouTube

Taxonomy

Topics: Advanced Bandit Algorithms Research · Game Theory and Applications · Experimental Behavioral Economics Studies