Competing for Shareable Arms in Multi-Player Multi-Armed Bandits

Renzhe Xu; Haotian Wang; Xingxuan Zhang; Bo Li; Peng Cui

arXiv:2305.19158 · cs.LG · August 7, 2023 · 2 citations

TL;DR

This paper introduces a new multi-player multi-armed bandit model where selfish agents compete and share rewards, analyzes equilibrium strategies, and proposes an algorithm with strong theoretical guarantees validated through experiments.

Contribution

The paper models a novel competitive multi-player bandit setting with reward sharing, analyzes Nash equilibrium, and proposes SMAA with proven regret bounds and stability properties.

Findings

01. SMAA achieves low regret for every player when all players follow the algorithm.

02. No single player can significantly improve their own reward by deviating from SMAA.

03. SMAA performs well in synthetic experiments.

Abstract

Competition for shareable and limited resources has long been studied with strategic agents. In reality, agents often have to learn the rewards of the resources and maximize them at the same time. To design an individualized competing policy, we model the competition between agents in a novel multi-player multi-armed bandit (MPMAB) setting where players are selfish and aim to maximize their own rewards. In addition, when several players pull the same arm, we assume that these players share the arm's reward equally in expectation. Under this setting, we first analyze the Nash equilibrium when the arms' rewards are known. Subsequently, we propose a novel Selfish MPMAB with Averaging Allocation (SMAA) approach based on the equilibrium. We theoretically demonstrate that SMAA can achieve a good regret guarantee for each player when all players follow the algorithm. Additionally, we establish…
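The known-rewards setting described in the abstract can be illustrated with a small sketch. This is not the paper's SMAA algorithm. It assumes, as one reading of the sharing rule, that each of the n players on an arm with mean reward mu receives mu / n in expectation. The function names (`greedy_allocation`, `is_pure_nash`) and the example reward values are hypothetical. The greedy rule below is a standard construction for congestion-style games of this kind and is used here only to produce an allocation from which no single player gains by switching arms.

```python
from typing import List


def greedy_allocation(mus: List[float], num_players: int) -> List[int]:
    """Place players one at a time on the arm with the best per-player share.

    Assumption (not from the paper): a player joining arm k, which already
    holds counts[k] players, expects mus[k] / (counts[k] + 1).
    """
    counts = [0] * len(mus)
    for _ in range(num_players):
        best = max(range(len(mus)), key=lambda k: mus[k] / (counts[k] + 1))
        counts[best] += 1
    return counts


def is_pure_nash(mus: List[float], counts: List[int], tol: float = 1e-12) -> bool:
    """Return True if no single player can gain by moving to another arm."""
    for j, n_j in enumerate(counts):
        if n_j == 0:
            continue  # nobody sits on arm j, so nobody can deviate from it
        current = mus[j] / n_j
        for k in range(len(mus)):
            # Moving to arm k would make the deviator share it with counts[k] others.
            if k != j and mus[k] / (counts[k] + 1) > current + tol:
                return False
    return True


# Hypothetical mean rewards for three arms, competed for by three players.
mus = [0.9, 0.5, 0.2]
counts = greedy_allocation(mus, 3)
print(counts, is_pure_nash(mus, counts))
```

With these illustrative values, two players share the best arm and one takes the second arm, while the weakest arm stays empty. Crowding all three players onto the best arm is not stable under this sharing rule, since any one of them would do better alone on the second arm.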

Code & Models

Repositories

windxrz/smaa (Official)

Videos

Competing for Shareable Arms in Multi-Player Multi-Armed Bandits · SlidesLive

Taxonomy

Topics: Advanced Bandit Algorithms Research · Auction Theory and Applications · Experimental Behavioral Economics Studies