Robust Stochastic Bandit Algorithms under Probabilistic Unbounded Adversarial Attack
Ziwei Guan, Kaiyi Ji, Donald J Bucci Jr, Timothy Y Hu, Joseph Palombo, Michael Liston, Yingbin Liang

TL;DR
This paper introduces robust bandit algorithms that keep near-optimal regret bounds even when an adversary attacks the revealed reward with some probability at each round and by an arbitrary, unbounded amount.
Contribution
It proposes novel median-based algorithms (med-E-UCB and med-ε-greedy) that are provably robust against unbounded, probabilistic adversarial attacks in multi-armed bandit problems.
Findings
Both algorithms achieve O(log T) regret bounds.
They are robust to arbitrary, unbounded reward perturbations.
Experiments confirm the theoretical guarantees and show both algorithms outperforming existing methods.
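To illustrate why a median-based estimate can tolerate unbounded perturbations, here is a minimal sketch. It is not taken from the paper: the Gaussian reward noise, the attack probability of 0.2, the constant attack value of 1e9, and the helper name `corrupted_rewards` are all assumptions chosen for illustration.

```python
import random
import statistics


def corrupted_rewards(true_mean, n, attack_prob, attack_value, rng):
    """Draw n unit-variance Gaussian rewards (an assumed noise model).

    Each reward is independently replaced by attack_value with
    probability attack_prob, mimicking a probabilistic attack whose
    magnitude is not bounded.
    """
    rewards = []
    for _ in range(n):
        if rng.random() < attack_prob:
            rewards.append(attack_value)  # attacked round
        else:
            rewards.append(rng.gauss(true_mean, 1.0))  # clean round
    return rewards


rng = random.Random(0)
samples = corrupted_rewards(
    true_mean=1.0, n=2000, attack_prob=0.2, attack_value=1e9, rng=rng
)

# The sample mean is dragged toward the attack value, while the sample
# median stays within the range of the clean rewards.
mean_estimate = statistics.fmean(samples)
median_estimate = statistics.median(samples)
print("mean:", mean_estimate)
print("median:", median_estimate)
```

In this setup the median is shifted slightly by the attacks, but the shift stays bounded as long as fewer than half of the samples are corrupted, whereas the mean grows with the attack value.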
Abstract
The multi-armed bandit formalism has been extensively studied under various attack models, in which an adversary can modify the reward revealed to the player. Previous studies focused on scenarios where the attack value either is bounded at each round or has a vanishing probability of occurrence. These models do not capture powerful adversaries that can catastrophically perturb the revealed reward. This paper investigates the attack model where an adversary attacks with a certain probability at each round, and its attack value can be arbitrary and unbounded if it attacks. Furthermore, the attack value does not necessarily follow a statistical distribution. We propose a novel sample median-based and exploration-aided UCB algorithm (called med-E-UCB) and a median-based ε-greedy algorithm (called med-ε-greedy). Both of these algorithms are provably robust to the…
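The attack model described in the abstract, combined with a median-based ε-greedy rule, can be sketched roughly as follows. This is an illustrative sketch only, not the authors' med-ε-greedy: the decaying schedule ε_t = min(1, cK/t), the Gaussian rewards, the attacker's strategy (pushing the best arm down and the others up by 1e9), and the function name `med_eps_greedy` are all assumptions.

```python
import random
import statistics


def med_eps_greedy(arm_means, horizon, attack_prob, c=5.0, seed=0):
    """Hypothetical median-based epsilon-greedy under probabilistic attack.

    Returns the number of times each arm was pulled.
    """
    rng = random.Random(seed)
    k = len(arm_means)
    best = max(range(k), key=lambda i: arm_means[i])
    history = [[] for _ in range(k)]  # observed (possibly attacked) rewards

    for t in range(1, horizon + 1):
        eps = min(1.0, c * k / t)  # assumed decaying exploration rate
        if t <= k:
            arm = t - 1  # pull every arm once so each median is defined
        elif rng.random() < eps:
            arm = rng.randrange(k)  # explore uniformly at random
        else:
            # exploit: pick the arm with the largest sample median
            arm = max(range(k), key=lambda i: statistics.median(history[i]))

        reward = rng.gauss(arm_means[arm], 1.0)
        if rng.random() < attack_prob:
            # assumed attacker: an unbounded-scale push against the best arm
            reward = -1e9 if arm == best else 1e9
        history[arm].append(reward)

    return [len(h) for h in history]


pulls = med_eps_greedy([0.0, 0.5, 2.0], horizon=2000, attack_prob=0.1)
print(pulls)
```

With these assumed parameters the best arm (index 2) should receive the large majority of pulls despite the attacks. Replacing `statistics.median` with `statistics.fmean` in the exploit step gives a mean-based baseline to compare against under the same attacker.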
Taxonomy
Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Machine Learning and Algorithms
