The Intrinsic Robustness of Stochastic Bandits to Strategic Manipulation

Zhe Feng; David C. Parkes; Haifeng Xu

arXiv:1906.01528·cs.LG·November 16, 2020·5 cites


TL;DR

This paper studies how the stochastic bandit algorithms UCB, ε-Greedy, and Thompson Sampling behave when the arms are rational players that strategically manipulate their own rewards. It shows that all three algorithms remain robust, with regret bounds that are tight under certain conditions.

Contribution

The paper provides the first analysis of the robustness of classic bandit algorithms against strategic manipulation by the arms, establishing regret upper bounds that hold under arbitrary adaptive manipulation strategies and showing that these bounds are tight.

Findings

1. All three algorithms achieve an $O(\max\{B, K \ln T\})$ regret upper bound.

2. The regret bounds are tight even under Nash equilibrium strategies.

3. Robustness holds as long as the total manipulation budget $B$ is $o(T)$.
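
The last finding can be read directly off the regret bound. The following is a short worked step rather than a statement from the paper: $R(T)$ is shorthand introduced here for the regret after $T$ rounds, and the number of arms $K$ is assumed fixed as $T$ grows.

```latex
R(T) = O\big(\max\{B,\; K \ln T\}\big)
\quad\text{and}\quad B = o(T)
\;\Longrightarrow\;
\frac{R(T)}{T}
  = O\!\left(\max\left\{\frac{B}{T},\; \frac{K \ln T}{T}\right\}\right)
  \longrightarrow 0
\quad\text{as } T \to \infty .
```

Both terms inside the maximum vanish after dividing by $T$, so the per-round regret goes to zero whenever the total budget grows more slowly than the horizon.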

Abstract

Motivated by economic applications such as recommender systems, we study the behavior of stochastic bandit algorithms under \emph{strategic behavior} conducted by rational actors, i.e., the arms. Each arm is a \emph{self-interested} strategic player who can modify its own reward whenever pulled, subject to a cross-period budget constraint, in order to maximize the expected number of times it is pulled. We analyze the robustness of three popular bandit algorithms: UCB, $ε$-Greedy, and Thompson Sampling. We prove that all three algorithms achieve a regret upper bound of $O(\max\{B, K \ln T\})$, where $B$ is the total budget across arms, $K$ is the total number of arms, and $T$ is the length of the time horizon. This regret guarantee holds under \emph{arbitrary adaptive} manipulation strategies of the arms. Our second set of main results shows that this regret bound is…
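
The setting described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's code: it assumes Bernoulli arms, the standard UCB1 index, and a hypothetical manipulation strategy in which every suboptimal arm tops its reported reward up to 1 until its own budget runs out. The names `run_strategic_ucb`, `means`, `budgets`, and `horizon` are illustrative, and regret is measured here against the true (unmanipulated) means.

```python
import math
import random


def run_strategic_ucb(means, budgets, horizon, seed=0):
    """Simulate UCB1 when arms may inflate their reported rewards.

    means   -- true Bernoulli mean of each arm
    budgets -- per-arm manipulation budget (total extra reward an arm may add)
    horizon -- number of rounds T
    Returns (pulls, regret, spent).
    """
    rng = random.Random(seed)
    k = len(means)
    pulls = [0] * k          # number of times each arm was pulled
    totals = [0.0] * k       # sum of *reported* rewards per arm
    spent = [0.0] * k        # manipulation budget used so far per arm
    best = max(range(k), key=lambda i: means[i])

    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1      # pull every arm once to initialise
        else:
            # UCB1 index computed from the reported (possibly inflated) rewards
            arm = max(
                range(k),
                key=lambda i: totals[i] / pulls[i]
                + math.sqrt(2.0 * math.log(t) / pulls[i]),
            )

        true_reward = 1.0 if rng.random() < means[arm] else 0.0

        # Hypothetical strategy (an assumption of this sketch): a suboptimal
        # arm raises its reported reward to 1 while it still has budget left.
        boost = 0.0
        if arm != best:
            boost = min(1.0 - true_reward, budgets[arm] - spent[arm])
        spent[arm] += boost

        totals[arm] += true_reward + boost
        pulls[arm] += 1

    # Regret with respect to the true means
    regret = sum((means[best] - means[i]) * pulls[i] for i in range(k))
    return pulls, regret, spent


if __name__ == "__main__":
    pulls, regret, spent = run_strategic_ucb(
        means=[0.9, 0.5, 0.4], budgets=[0.0, 20.0, 20.0], horizon=5000
    )
    print("pulls:", pulls)
    print("regret:", round(regret, 2))
    print("budget spent:", spent)
```

Rerunning with larger or smaller entries in `budgets` gives a rough feel for how the number of pulls of suboptimal arms, and hence the regret, changes with the total budget. A single simulated run is only an illustration and does not verify the bound.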



Taxonomy

Topics: Advanced Bandit Algorithms Research · Auction Theory and Applications · Reinforcement Learning in Robotics