Robust Stochastic Bandit Algorithms under Probabilistic Unbounded Adversarial Attack

Ziwei Guan; Kaiyi Ji; Donald J Bucci Jr; Timothy Y Hu; Joseph Palombo; Michael Liston; Yingbin Liang

arXiv:2002.07214·cs.LG·February 19, 2020·1 cites

Open Access

TL;DR

This paper introduces robust bandit algorithms that retain O(log T) regret even when an adversary attacks with some probability at each round and may perturb the revealed reward by an arbitrary, unbounded amount.

Contribution

It proposes novel median-based algorithms (med-E-UCB and med-ε-greedy) that are provably robust against unbounded, probabilistic adversarial attacks in multi-armed bandit problems.
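
To make the median-based idea concrete, the following is a minimal Python sketch of an ε-greedy player that ranks arms by the sample median of their observed rewards rather than the sample mean. It is an illustration under stated assumptions, not the paper's exact med-ε-greedy procedure: the function names, the exploration schedule `eps = min(1, c * n_arms / t)`, the constant `c`, the Gaussian reward model, and the toy adversary are all hypothetical choices made for this example.

```python
import random
import statistics


def med_eps_greedy(pull, n_arms, horizon, c=10.0, seed=0):
    """Illustrative median-based epsilon-greedy player (a sketch, not the paper's pseudocode).

    pull(arm) returns the observed, possibly attacked, reward for that arm.
    Returns the number of times each arm was played.
    """
    rng = random.Random(seed)
    rewards = [[] for _ in range(n_arms)]  # observed rewards per arm
    for t in range(1, horizon + 1):
        eps = min(1.0, c * n_arms / t)  # assumed decaying exploration rate
        unexplored = [a for a in range(n_arms) if not rewards[a]]
        if unexplored:
            arm = unexplored[0]  # play every arm at least once
        elif rng.random() < eps:
            arm = rng.randrange(n_arms)  # explore uniformly at random
        else:
            # exploit: pick the arm with the largest sample median
            arm = max(range(n_arms), key=lambda a: statistics.median(rewards[a]))
        rewards[arm].append(pull(arm))
    return [len(r) for r in rewards]


def make_attacked_env(means, attack_prob, seed=1):
    """Toy environment: Gaussian rewards, attacked with probability attack_prob."""
    rng = random.Random(seed)
    best = max(range(len(means)), key=lambda a: means[a])

    def pull(arm):
        if rng.random() < attack_prob:
            # hypothetical adversary: make the best arm look terrible, others look great
            return -1e9 if arm == best else 1e9
        return rng.gauss(means[arm], 1.0)

    return pull


counts = med_eps_greedy(make_attacked_env([0.0, 2.0], attack_prob=0.1), n_arms=2, horizon=3000)
print("pulls per arm:", counts)
```

In this toy setting the player should still concentrate its pulls on the arm with the higher true mean, because a minority of extreme values cannot drag a sample median outside the range of the clean samples.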

Findings

01

Both algorithms achieve O(log T) regret bounds.

02

They are robust to arbitrary, unbounded reward perturbations.

03

Experimental results confirm the theoretical guarantees and show that the algorithms outperform existing methods.

Abstract

The multi-armed bandit formalism has been extensively studied under various attack models, in which an adversary can modify the reward revealed to the player. Previous studies focused on scenarios where the attack value either is bounded at each round or has a vanishing probability of occurrence. These models do not capture powerful adversaries that can catastrophically perturb the revealed reward. This paper investigates the attack model where an adversary attacks with a certain probability at each round, and its attack value can be arbitrary and unbounded if it attacks. Furthermore, the attack value does not necessarily follow a statistical distribution. We propose a novel sample median-based and exploration-aided UCB algorithm (called med-E-UCB) and a median-based $\epsilon$-greedy algorithm (called med-$\epsilon$-greedy). Both of these algorithms are provably robust to the…
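
The attack model described in the abstract can be simulated in a few lines to show why a sample median is a natural estimator here. The sketch below is an assumption-laden illustration: the helper name `attacked_rewards`, the Gaussian clean rewards, the attack probability of 0.2, and the single fixed attack value are hypothetical, and a real adversary in this model could pick a different value on every attacked round.

```python
import random
import statistics


def attacked_rewards(true_mean, n, attack_prob, attack_value, seed=0):
    """Draw n rewards; each is replaced by attack_value with probability attack_prob.

    Clean rewards are assumed Gaussian with unit variance for this illustration.
    """
    rng = random.Random(seed)
    observed = []
    for _ in range(n):
        clean = rng.gauss(true_mean, 1.0)
        attacked = rng.random() < attack_prob
        observed.append(attack_value if attacked else clean)
    return observed


samples = attacked_rewards(true_mean=1.0, n=2001, attack_prob=0.2, attack_value=1e12)
mean_estimate = statistics.fmean(samples)     # dominated by the attacked rounds
median_estimate = statistics.median(samples)  # stays within the clean reward range
print("sample mean  :", mean_estimate)
print("sample median:", median_estimate)
```

When fewer than half of the observations are attacked, the sample median always lies within the range of the clean samples, however large the attack values are. It is shifted toward a nearby quantile of the clean distribution rather than being exact, while the sample mean can be moved arbitrarily far by a single attacked round.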

Taxonomy

Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Machine Learning and Algorithms