Stochastic Bandits Robust to Adversarial Attacks

Xuchuang Wang; Jinhang Zuo; Xutong Liu; John C.S. Lui; Mohammad Hajiesmaili

arXiv:2408.08859·cs.LG·August 19, 2024



TL;DR

This paper develops robust stochastic bandit algorithms resilient to adversarial reward manipulations, providing tight regret bounds and demonstrating a fundamental difference from corruption models.

Contribution

It introduces new algorithms with proven regret bounds for both known and unknown attack budgets, advancing robustness in adversarial bandit settings.

Findings

01

Achieves regret bounds of O((K/Δ) log T + KC) (additive in C) and Õ(√(KTC)) (multiplicative in C) for known attack budgets.

02

Achieves regret bounds of √(KT) + KC^2 and KC√T for unknown attack budgets.

03

Provides lower bounds confirming the optimality of the proposed algorithms.
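To see how the additive and multiplicative bounds listed above trade off as the attack budget C grows, the following minimal sketch evaluates their leading terms. It assumes all constants and logarithmic factors hidden by the O and Õ notation are equal to 1, so the crossover points it shows are illustrative only and are not the paper's results. The function names `known_budget_bounds` and `unknown_budget_bounds` are hypothetical.

```python
import math


def known_budget_bounds(K, T, gap, C):
    """Leading terms of the known-budget bounds (constants and log factors dropped)."""
    additive = (K / gap) * math.log(T) + K * C      # O((K/Δ) log T + KC)
    multiplicative = math.sqrt(K * T * C)           # Õ(√(KTC))
    return additive, multiplicative


def unknown_budget_bounds(K, T, C):
    """Leading terms of the unknown-budget bounds as stated in the findings."""
    additive = math.sqrt(K * T) + K * C ** 2        # √(KT) + KC^2
    multiplicative = K * C * math.sqrt(T)           # KC√T
    return additive, multiplicative


if __name__ == "__main__":
    K, T, gap = 10, 10_000, 0.1  # illustrative values, not taken from the paper
    for C in (1, 10, 100):
        add_k, mul_k = known_budget_bounds(K, T, gap, C)
        add_u, mul_u = unknown_budget_bounds(K, T, C)
        print(f"C={C}: known add={add_k:.0f} mul={mul_k:.0f} | "
              f"unknown add={add_u:.0f} mul={mul_u:.0f}")
```

Under these simplifying assumptions, which expression is smaller depends on the size of C relative to K, T, and Δ, and the ordering can flip as C grows.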

Abstract

This paper investigates stochastic multi-armed bandit algorithms that are robust to adversarial attacks, where an attacker can first observe the learner's action and then alter their reward observation. We study two cases of this model, with or without the knowledge of an attack budget $C$, defined as an upper bound of the summation of the difference between the actual and altered rewards. For both cases, we devise two types of algorithms with regret bounds having additive or multiplicative $C$ dependence terms. For the known attack budget case, we prove our algorithms achieve the regret bound of $O((K/\Delta) \log T + KC)$ and $\tilde{O}(\sqrt{KTC})$ for the additive and multiplicative $C$ terms, respectively, where $K$ is the number of arms, $T$ is the time horizon, $\Delta$ is the gap between the expected rewards of the optimal arm and the second-best arm, and $\tilde{O}$ hides…
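The attack model in the abstract can be sketched in a few lines: the attacker sees which arm the learner pulled, then changes the observed reward, and the total absolute change over all rounds may not exceed the budget $C$. The sketch below assumes Bernoulli rewards and a simple attacker that pushes rewards of one chosen arm toward 0 until the budget runs out. The class `BudgetedAttacker`, the function `play_round`, and the attack strategy are hypothetical illustrations, not the paper's construction.

```python
import random


class BudgetedAttacker:
    """Attacker that observes the pulled arm, then lowers that arm's reward."""

    def __init__(self, budget, victim_arm):
        self.budget = budget          # attack budget C
        self.spent = 0.0              # running sum of |actual - altered|
        self.victim_arm = victim_arm  # arm whose rewards are attacked (assumption)

    def alter(self, arm, reward):
        # The attacker acts only after seeing the learner's action.
        if arm != self.victim_arm:
            return reward
        remaining = self.budget - self.spent
        change = min(reward, remaining)  # push the reward toward 0, within budget
        self.spent += change
        return reward - change


def play_round(means, arm, attacker, rng):
    """Draw a Bernoulli reward for `arm`, then let the attacker alter it."""
    actual = 1.0 if rng.random() < means[arm] else 0.0
    observed = attacker.alter(arm, actual)
    return actual, observed


if __name__ == "__main__":
    rng = random.Random(0)
    attacker = BudgetedAttacker(budget=3.0, victim_arm=0)
    total_change = 0.0
    for _ in range(100):
        actual, observed = play_round([0.9, 0.5], 0, attacker, rng)
        total_change += abs(actual - observed)
    print(total_change <= attacker.budget)
```

Because the attacker chooses its change after seeing the arm, it can spend its whole budget on the arm it wants to suppress, which is the feature the abstract highlights.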

Peer Reviews

Decision·ICLR 2025 Poster

Reviewer 01 · Rating 8 · Confidence 2

Strengths

1) The paper explores an under-studied area of stochastic bandits where adversarial attacks are present and obtains novel results as well as improving prior work about the already studied corruption model.
2) The theoretical bounds are tight (up to log terms), with mathematical proofs for each statement.
3) Experimental results to validate the theoretical claims are provided.
4) The authors do a good job clarifying the differences between corruption and attack models, highlighting the need for s

Weaknesses

1) The implications for practical settings, such as recommendation systems or online auctions, could use some expanding.
2) Unfortunately, all the proofs are relegated to the appendix.

Reviewer 02 · Rating 6 · Confidence 4

Strengths

- This paper addresses a gap in the literature, recognizing that adversarial attacks have not been thoroughly explored within the classical multi-armed bandit (MAB) framework and effectively filling this gap.
- The authors examine both additive and multiplicative bounds, providing a clear comparison that shows which approach performs better based on the attack budget C.
- Figures 1 and, especially, Figure 2 nicely illustrate the results of attack-based multiplicative and additive bounds, offerin

Weaknesses

1. Algorithm Design: I didn’t notice any novel or original elements in terms of algorithm design. The PE algorithm has been applied in this context in prior work (cited below), and the idea of using CORRAL has already been explored in similar settings, such as in Misspecified Gaussian Process Bandit Optimization. However, I only find this to be a minor weakness of the paper.
2. Terminology: I like the terminology of “attacks” to distinguish it from the classical “corrupted” setting. However, i

Reviewer 03 · Rating 6 · Confidence 2

Strengths

The paper advances the state of the art on algorithms robust to adversarial attacks. The paper is well-written and the relationship/improvement relative to previous work is well described.

Weaknesses

The technical contribution is quite weak. For instance, the algorithmic approaches follow previous work and the analysis is not very involved.

Videos

Stochastic Bandits Robust to Adversarial Attacks · SlidesLive

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Bandit Algorithms Research · Data Stream Mining Techniques