Adversarial Attacks on Combinatorial Multi-Armed Bandits

Rishab Balasubramanian; Jiawei Li; Prasad Tadepalli; Huazheng Wang; Qingyun Wu; Haoyu Zhao

arXiv:2310.05308·cs.LG·June 5, 2024·1 cites

TL;DR

This paper investigates the vulnerability of combinatorial multi-armed bandits to reward poisoning attacks, establishing conditions for attackability, proposing an attack algorithm, and analyzing the impact of knowledge about the bandit instance.

Contribution

It introduces a comprehensive attackability condition for CMAB, reveals the dependence of attack success on adversary knowledge, and validates findings through extensive experiments.

Findings

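01. Attackability depends on intrinsic properties of the CMAB instance, such as the reward distributions of super arms and the outcome distributions of base arms.

02. Whether the adversary knows the bandit instance affects whether an attack can succeed.

03. Attacks are harder when the environment is unknown to the adversary.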

Abstract

We study reward poisoning attacks on Combinatorial Multi-armed Bandits (CMAB). We first provide a sufficient and necessary condition for the attackability of CMAB, a notion to capture the vulnerability and robustness of CMAB. The attackability condition depends on the intrinsic properties of the corresponding CMAB instance such as the reward distributions of super arms and outcome distributions of base arms. Additionally, we devise an attack algorithm for attackable CMAB instances. Contrary to prior understanding of multi-armed bandits, our work reveals a surprising fact that the attackability of a specific CMAB instance also depends on whether the bandit instance is known or unknown to the adversary. This finding indicates that adversarial attacks on CMAB are difficult in practice and a general attack strategy for any CMAB instance does not exist since the environment is mostly unknown…
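To make the threat model concrete, the following is a minimal sketch of a reward poisoning attack on a toy CMAB. It is not the paper's attack algorithm: the top-k super-arm structure, the Bernoulli base arms, the CUCB-style learner with a `1.5 * log(t)` bonus, and the "zero out every non-target outcome" rule are all assumptions chosen for illustration, and the function name `simulate_poisoned_cucb` is hypothetical.

```python
import math
import random


def simulate_poisoned_cucb(means, target, k, rounds, attack=True, seed=0):
    """Toy top-k CMAB with a CUCB-style learner and an optional adversary.

    Assumed setup (not taken from the paper): Bernoulli base arms,
    semi-bandit feedback, and a super arm equal to any set of k base arms.
    Returns (rounds in which the target super arm was played, attack cost).
    """
    rng = random.Random(seed)
    m = len(means)
    counts = [0] * m      # observations per base arm
    sums = [0.0] * m      # sum of (possibly poisoned) outcomes per base arm
    target = frozenset(target)
    target_plays = 0
    cost = 0.0

    for t in range(1, rounds + 1):
        # Upper confidence bound per base arm; unobserved arms come first.
        ucb = [
            float("inf") if counts[i] == 0
            else sums[i] / counts[i] + math.sqrt(1.5 * math.log(t) / counts[i])
            for i in range(m)
        ]
        # Exact oracle for the top-k structure: take the k largest indices.
        chosen = sorted(range(m), key=lambda i: -ucb[i])[:k]
        if frozenset(chosen) == target:
            target_plays += 1

        for i in chosen:
            outcome = 1.0 if rng.random() < means[i] else 0.0
            if attack and i not in target and outcome > 0:
                # Adversary overwrites the outcome with 0 and pays the change.
                cost += outcome
                outcome = 0.0
            counts[i] += 1
            sums[i] += outcome

    return target_plays, cost


if __name__ == "__main__":
    means = [0.9, 0.8, 0.7, 0.6, 0.5]   # hypothetical instance
    target = {3, 4}                      # a suboptimal super arm
    print(simulate_poisoned_cucb(means, target, k=2, rounds=5000))
    print(simulate_poisoned_cucb(means, target, k=2, rounds=5000, attack=False))
```

In this toy instance the poisoned learner should settle on the target super arm while the adversary only pays in the rounds where a non-target base arm is still being explored. The sketch does not capture the paper's attackability condition or the known versus unknown environment distinction; it only illustrates what "modifying base-arm outcomes" means operationally.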

Peer Reviews

Decision·ICML 2024 Poster

Reviewer 01 · Rating 6 (marginally above the acceptance threshold) · Confidence 4

Strengths

1) This paper is the first to study adversarial attacks against CMAB algorithms, which is an interesting and timely topic. 2) The novel characterization of the sufficient and necessary conditions for polynomial attackability in CMAB provides insight into the distinct challenges posed by CMAB instances with polynomial costs. 3) The author presents a hard example highlighting that an instance can be polynomially attackable when the adversary is aware of the environment but becomes polynomially una

Weaknesses

1) Algorithm 1 seems to be straightforward but may lead to large attack costs when $\Delta$ is small. The attack cost's dependency on $\Delta$ from previous works [Jun et al., 2018, Liu and Shroff, 2019] is usually linear in $\sum_{i} \Delta_i$, while the dependency in this paper is $1 / \Delta_{S^*}$, which is worse than $\sum_{i} \Delta_i$ as $\Delta_i \le 1$. This is due to the lack of fine-grained attack value design. 2) While Theorem 4.1 establishes the difficulty of successfully targeti

Reviewer 02 · Rating 6 (marginally above the acceptance threshold) · Confidence 4

Strengths

Originality: The paper introduces a novel concept of polynomial attackability in the context of combinatorial multi-armed bandits (CMAB). This notion captures the vulnerability and robustness of CMAB systems, which is a unique contribution to the field. Quality: The paper provides a rigorous analysis of the attackability of CMAB systems and presents a sufficient and necessary condition for polynomial attackability. The paper also presents an attack algorithm for attackable instances. The paper

Weaknesses

1. The limitations of the findings are not sufficiently discussed. The findings on polynomial attackability depend heavily on the threat model, in which the adversary can modify the outcomes of the base arms. However, recent work has discussed other types of adversarial attacks on bandits and RL [1-5], including environment poisoning attacks and action poisoning attacks. In the CMAB system, the environment-manipulation adversary could manipulate the reward function $r$ and the actio

Reviewer 03 · Rating 5 (marginally below the acceptance threshold) · Confidence 2

Strengths

Adversarial attacks have been studied in stochastic bandits, linear bandits, and adversarial bandits. The study of adversarial attacks on CMAB is new. This work proposes new notions of attackability based on the structure of CMAB.

Weaknesses

While the framework follows the previous work of Wang & Chen, I feel the paper could benefit from a discussion of simpler CMAB models first (i.e., without the trigger function, etc.).

Code & Models

Repositories

haoyuzhao123/robust-cmab-code (Official)

Taxonomy

Topics: Advanced Bandit Algorithms Research · Data Stream Mining Techniques · Blockchain Technology Applications and Security

Methods: Balanced Selection