Data Poisoning Attacks on Stochastic Bandits
Fang Liu, Ness Shroff

TL;DR
This paper studies adversarial attacks on stochastic bandit algorithms and shows that an attacker who manipulates the rewards can hijack the learner's decisions at low cost, which is a security risk for real-world deployments.
Contribution
The paper introduces a framework for offline attacks on bandit algorithms and studies online attacks, including an adaptive attack strategy that works against any bandit algorithm without knowledge of which algorithm is in use.
Findings
An attacker can force a bandit algorithm to pull a target arm with high probability by slightly manipulating the rewards.
The adaptive attack makes the bandit algorithm suffer linear regret at only logarithmic cost to the attacker.
These results expose significant security vulnerabilities in stochastic bandit algorithms.
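To make the online setting concrete, here is a minimal Python sketch of a reward-poisoning attack on a UCB1 learner. It is a toy illustration, not the paper's adaptive strategy: the function name `run_attack`, the arm means, the Gaussian noise, the `margin` parameter, and the rule of capping every non-target arm's running mean below the target's are all assumptions made for this example.

```python
import math
import random


def run_attack(horizon=5000, means=(0.9, 0.5, 0.2), target=2, margin=0.5, seed=0):
    """Simulate UCB1 under a simple online reward-poisoning attack (toy sketch)."""
    rng = random.Random(seed)
    k = len(means)
    pulls = [0] * k    # number of times each arm was pulled
    sums = [0.0] * k   # sum of post-attack rewards seen by the learner
    cost = 0.0         # total absolute change made by the attacker

    for t in range(1, horizon + 1):
        # Learner: UCB1 (pull each arm once, then maximise the index).
        if t <= k:
            arm = t - 1
        else:
            arm = max(
                range(k),
                key=lambda i: sums[i] / pulls[i]
                + math.sqrt(2 * math.log(t) / pulls[i]),
            )

        reward = rng.gauss(means[arm], 0.1)  # true environment reward

        # Attacker (hypothetical rule): leave the target arm alone and push
        # any other arm's running mean to at most (target mean - margin).
        if arm != target and pulls[target] > 0:
            cap = sums[target] / pulls[target] - margin
            # Largest reward that keeps the new mean of `arm` at or below cap.
            max_allowed = cap * (pulls[arm] + 1) - sums[arm]
            corrupted = min(reward, max_allowed)
        else:
            corrupted = reward

        cost += abs(reward - corrupted)
        pulls[arm] += 1
        sums[arm] += corrupted

    return pulls, cost


if __name__ == "__main__":
    pulls, cost = run_attack()
    print("pulls per arm:", pulls)
    print("total attack cost:", round(cost, 2))
```

In this sketch the target is the arm with the lowest true mean, so a learner that keeps pulling it accumulates regret in proportion to the horizon, while the attacker only pays when a non-target arm is pulled.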
Abstract
Stochastic multi-armed bandits form a class of online learning problems that have important applications in online recommendation systems, adaptive medical treatment, and many others. Even though potential attacks against these learning algorithms may hijack their behavior, causing catastrophic loss in real-world applications, little is known about adversarial attacks on bandit algorithms. In this paper, we propose a framework of offline attacks on bandit algorithms and study convex optimization based attacks on several popular bandit algorithms. We show that the attacker can force the bandit algorithm to pull a target arm with high probability by a slight manipulation of the rewards in the data. Then we study a form of online attacks on bandit algorithms and propose an adaptive attack strategy against any bandit algorithm without the knowledge of the bandit algorithm. Our adaptive…
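As a rough illustration of what an optimization-based offline attack can look like, the Python sketch below handles a deliberately small special case: two arms, logged rewards for each, and a learner assumed to pick the arm with the higher empirical mean. The attacker looks for the smallest squared-norm change to the logged rewards that puts the target arm's mean at least `xi` above the other arm's. With a single linear constraint this is a projection onto a half-space and has a closed form. The function name, the `xi` parameter, and the greedy-learner assumption are illustrative and not taken from the paper.

```python
def min_norm_offline_attack(target_rewards, other_rewards, xi=0.05):
    """Minimum squared-norm edit so that mean(target) >= mean(other) + xi.

    Returns (new_target_rewards, new_other_rewards, squared_norm_cost).
    """
    n_t, n_o = len(target_rewards), len(other_rewards)
    mean_t = sum(target_rewards) / n_t
    mean_o = sum(other_rewards) / n_o

    # How far the data is from satisfying the constraint.
    shortfall = xi - (mean_t - mean_o)
    if shortfall <= 0:
        # Constraint already holds, so no manipulation is needed.
        return list(target_rewards), list(other_rewards), 0.0

    # The constraint is a . eps >= shortfall, where a has entries 1/n_t on
    # target rewards and -1/n_o on the other arm's rewards.
    norm_sq = 1.0 / n_t + 1.0 / n_o      # ||a||^2
    scale = shortfall / norm_sq          # eps = scale * a (projection)

    up = scale / n_t                     # added to each target reward
    down = scale / n_o                   # subtracted from each other reward
    new_target = [r + up for r in target_rewards]
    new_other = [r - down for r in other_rewards]
    cost = shortfall ** 2 / norm_sq      # equals ||eps||^2
    return new_target, new_other, cost


if __name__ == "__main__":
    new_t, new_o, cost = min_norm_offline_attack([0.1, 0.2, 0.3], [0.8, 0.9])
    print(sum(new_t) / len(new_t) - sum(new_o) / len(new_o), cost)
```

With more than two arms or a learner other than greedy there are several constraints and no simple closed form, so a general convex solver would be needed instead.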
Taxonomy
Topics: Advanced Bandit Algorithms Research · Data Stream Mining Techniques · Adversarial Robustness in Machine Learning
