Saving Stochastic Bandits from Poisoning Attacks via Limited Data Verification

Anshuka Rangi; Long Tran-Thanh; Haifeng Xu; Massimo Franceschetti

arXiv:2102.07711·cs.LG·May 5, 2022

TL;DR

This paper investigates the vulnerability of bandit algorithms to data poisoning attacks and proposes verification-based methods, including Secure-UCB and Secure-BARBAR, to effectively mitigate such attacks with limited verification resources.

Contribution

It introduces verification mechanisms for bandit algorithms that restore optimal regret under poisoning attacks, providing both upper and lower bounds on verification requirements.

Findings

01 Verification reduces the impact of attacks enough to achieve near-optimal regret.

02 Secure-UCB and modified ETC algorithms recover from attacks with O(log T) verifications.

03 Secure-BARBAR achieves sublinear regret with a bounded verification budget.

Abstract

We study bandit algorithms under data poisoning attacks in a bounded reward setting. We consider a strong attacker model in which the attacker can observe both the selected actions and their corresponding rewards and can contaminate the rewards with additive noise. We show that any bandit algorithm with regret $O(\log T)$ can be forced to suffer a regret $\Omega(T)$ with an expected amount of contamination $O(\log T)$. This amount of contamination is also necessary, as we prove that there exists an $O(\log T)$ regret bandit algorithm, specifically the classical UCB, that requires $\Omega(\log T)$ amount of contamination to suffer regret $\Omega(T)$. To combat such attacks, our second main contribution is to propose verification based mechanisms, which use limited verification to access a limited number of uncontaminated rewards. In particular, for the case of unlimited verifications, we…
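
The attacker model in the abstract can be illustrated with a small simulation. The sketch below is not the paper's attack construction or its Secure-UCB algorithm; it assumes a textbook UCB1 learner on Bernoulli arms and a hypothetical attacker that sees each selected arm and reward and adds noise equal to minus the reward whenever the best arm is pulled. The function name `run_ucb` and all of its parameters are made up for this example.

```python
import math
import random


def run_ucb(T, means, attack=False, seed=0):
    """Run textbook UCB1 on Bernoulli arms for T rounds.

    If attack is True, a hypothetical attacker observes the chosen arm and
    its reward and adds noise (-reward) whenever the best arm is chosen, so
    the learner always records 0 for that arm.
    Returns (pseudo-regret, total contamination, pull counts per arm).
    """
    rng = random.Random(seed)
    k = len(means)
    best = max(range(k), key=lambda a: means[a])
    counts = [0] * k      # number of pulls per arm
    sums = [0.0] * k      # sum of (possibly contaminated) observed rewards
    regret = 0.0
    contamination = 0.0   # total absolute additive noise injected

    for t in range(1, T + 1):
        if t <= k:
            arm = t - 1   # pull every arm once to initialise the estimates
        else:
            # UCB1 index: empirical mean plus exploration bonus
            arm = max(
                range(k),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < means[arm] else 0.0
        if attack and arm == best:
            contamination += reward  # noise of size |reward| is added
            reward = 0.0
        counts[arm] += 1
        sums[arm] += reward
        regret += means[best] - means[arm]

    return regret, contamination, counts


if __name__ == "__main__":
    for attacked in (False, True):
        r, c, n = run_ucb(5000, [0.9, 0.5], attack=attacked)
        print(f"attack={attacked}: regret={r:.1f}, contamination={c:.0f}, pulls={n}")
```

Because the attacker only intervenes when the best arm is pulled, and UCB1 stops pulling an arm whose observed mean stays at zero after roughly logarithmically many tries, the injected contamination stays small while the regret grows with the horizon. This matches the qualitative picture in the abstract, under the assumptions stated above.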

Taxonomy

Topics: Advanced Bandit Algorithms Research · Adversarial Robustness in Machine Learning