Multi-armed Bandit Algorithm against Strategic Replication

Suho Shin; Seungjoon Lee; Jungseul Ok

arXiv:2110.12160 · cs.LG · October 26, 2021 · 1 citation

Open Access

TL;DR

This paper introduces Hierarchical UCB algorithms designed to prevent strategic replication in multi-armed bandit problems, achieving low regret even with irrational agents and demonstrating effectiveness through theoretical analysis and experiments.

Contribution

The paper proposes replication-proof Hierarchical UCB algorithms that mitigate strategic arm replication and maintain low regret in multi-armed bandit settings.

Findings

01

H-UCB achieves $O(\log T)$ regret under equilibrium.

02

RH-UCB maintains sublinear regret with irrational agents.

03

Algorithms are validated through numerical experiments.

Abstract

We consider a multi-armed bandit problem in which each agent registers a set of arms and receives a reward whenever one of its arms is selected. An agent might strategically submit more arms through replication, which can bring more reward by abusing the bandit algorithm's exploration-exploitation balance. Our analysis reveals that a standard algorithm indeed fails to prevent replication and suffers regret that is linear in time $T$. We aim to design a bandit algorithm that discourages replication and also achieves a small cumulative regret. We devise Hierarchical UCB (H-UCB), which is replication-proof and has $O(\ln T)$ regret under any equilibrium. We further propose Robust Hierarchical UCB (RH-UCB), which has sublinear regret even in a realistic scenario with irrational agents who replicate carelessly. We verify our theoretical findings through numerical experiments.
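To make the replication incentive concrete, here is a minimal simulation sketch. It is not the paper's implementation. It assumes Bernoulli rewards and the textbook UCB1 index, and it interprets "hierarchical" as a two-level rule: UCB over agents first, then UCB over the chosen agent's arms. The function names (`ucb_index`, `run_flat_ucb`, `run_hierarchical_ucb`), the arm means, and the horizon are all hypothetical choices made for illustration.

```python
import math
import random


def ucb_index(total_reward, pulls, t):
    """Textbook UCB1 index; unpulled options get +inf so they are tried first."""
    if pulls == 0:
        return math.inf
    return total_reward / pulls + math.sqrt(2.0 * math.log(t) / pulls)


def run_flat_ucb(arms, horizon, seed=0):
    """Run UCB1 over all registered arms, ignoring who owns them.

    arms: list of (agent_id, bernoulli_mean). Returns {agent_id: pulls}.
    """
    rng = random.Random(seed)
    pulls = [0] * len(arms)
    rewards = [0.0] * len(arms)
    agent_pulls = {agent: 0 for agent, _ in arms}
    for t in range(1, horizon + 1):
        i = max(range(len(arms)), key=lambda k: ucb_index(rewards[k], pulls[k], t))
        agent, mean = arms[i]
        pulls[i] += 1
        rewards[i] += 1.0 if rng.random() < mean else 0.0
        agent_pulls[agent] += 1
    return agent_pulls


def run_hierarchical_ucb(arms, horizon, seed=0):
    """Two-level sketch (an assumption): UCB over agents, then UCB within the agent."""
    rng = random.Random(seed)
    agents = sorted({agent for agent, _ in arms})
    arms_of = {a: [i for i, (o, _) in enumerate(arms) if o == a] for a in agents}
    pulls = [0] * len(arms)
    rewards = [0.0] * len(arms)
    agent_pulls = {a: 0 for a in agents}
    agent_rewards = {a: 0.0 for a in agents}
    for t in range(1, horizon + 1):
        # Level 1: each agent is treated as a single arm with pooled statistics.
        a = max(agents, key=lambda x: ucb_index(agent_rewards[x], agent_pulls[x], t))
        # Level 2: UCB among this agent's own arms, on the agent's local clock.
        local_t = agent_pulls[a] + 1
        i = max(arms_of[a], key=lambda k: ucb_index(rewards[k], pulls[k], local_t))
        r = 1.0 if rng.random() < arms[i][1] else 0.0
        pulls[i] += 1
        rewards[i] += r
        agent_pulls[a] += 1
        agent_rewards[a] += r
    return agent_pulls


if __name__ == "__main__":
    honest = [("A", 0.9), ("B", 0.5)]
    replicated = [("A", 0.9)] + [("B", 0.5)] * 10  # agent B registers 10 copies
    print("flat, honest:      ", run_flat_ucb(honest, 5000))
    print("flat, replicated:  ", run_flat_ucb(replicated, 5000))
    print("hierarchical, rep.:", run_hierarchical_ucb(replicated, 5000))
```

In this sketch, flat UCB1 explores every copy of agent B's weaker arm separately, so B's share of selections grows with the number of replicas. The two-level rule pools B's statistics at the agent level, so in this toy setting extra copies do not earn B more selections. The paper's actual algorithms and guarantees may differ in their details.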

Taxonomy

Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Auction Theory and Applications