Combinatorial Bandits under Strategic Manipulations

Jing Dong; Ke Li; Shuai Li; Baoxiang Wang

arXiv:2102.12722·cs.LG·November 22, 2021

TL;DR

This paper studies combinatorial multi-armed bandits under strategic reward manipulations, proposing a robust algorithm with bounded regret, and validates its effectiveness through extensive experiments on real and synthetic data.

Contribution

It introduces a strategic variant of the combinatorial UCB algorithm that accounts for reward manipulations and provides theoretical regret bounds and lower bounds on manipulation budgets.

Findings

01

The proposed algorithm achieves regret of O(m log T + m B_{max}) under strategic manipulations.

02

The paper derives lower bounds on the budget that arms need in order to induce a given level of regret.

03

Experimental results confirm robustness and regret bounds across various manipulation regimes.

Abstract

Strategic behavior against sequential learning methods, such as "click framing" in real recommendation systems, has been widely observed. Motivated by such behavior, we study the problem of combinatorial multi-armed bandits (CMAB) under strategic manipulations of rewards, where each arm can modify the emitted reward signals for its own interest. This characterization of the adversarial behavior is a relaxation of previously well-studied settings such as adversarial attacks and adversarial corruption. We propose a strategic variant of the combinatorial UCB algorithm, which has a regret of at most $O(m \log T + m B_{max})$ under strategic manipulations, where $T$ is the time horizon, $m$ is the number of arms, and $B_{max}$ is the maximum budget of an arm. We provide lower bounds on the budget for arms to incur a certain regret of the bandit algorithm. Extensive experiments on online worker…
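To make the idea of a budget-aware combinatorial UCB concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes the optimistic index is the usual CUCB confidence radius plus an extra slack term `b_max / pulls`, which is one natural way to absorb a bounded total manipulation; the exact index in the paper may differ. The top-k oracle, the Bernoulli rewards, and the "spend up to one unit of budget per pull" manipulation strategy are all hypothetical choices made for illustration.

```python
import math
import random


def strategic_ucb_index(mean, pulls, t, b_max):
    """Assumed index: empirical mean + CUCB radius + budget slack b_max / pulls."""
    if pulls == 0:
        return math.inf  # force every arm to be tried at least once
    radius = math.sqrt(3.0 * math.log(t) / (2.0 * pulls))
    return mean + radius + b_max / pulls


def top_k_oracle(indices, k):
    """Toy combinatorial oracle (hypothetical): the k arms with the largest index."""
    order = sorted(range(len(indices)), key=lambda i: indices[i], reverse=True)
    return order[:k]


def run(true_means, budgets, k, horizon, seed=0):
    """Simulate the sketch; returns how often each arm was played."""
    rng = random.Random(seed)
    m = len(true_means)
    b_max = max(budgets)
    remaining = list(budgets)  # manipulation budget left for each arm
    pulls = [0] * m
    sums = [0.0] * m           # sums of observed (possibly manipulated) rewards
    for t in range(1, horizon + 1):
        indices = []
        for i in range(m):
            mean = sums[i] / pulls[i] if pulls[i] else 0.0
            indices.append(strategic_ucb_index(mean, pulls[i], t, b_max))
        for i in top_k_oracle(indices, k):
            reward = 1.0 if rng.random() < true_means[i] else 0.0
            # Hypothetical manipulation: inflate the signal by up to 1 per pull
            # until the arm's budget is exhausted.
            boost = min(1.0, remaining[i])
            remaining[i] -= boost
            sums[i] += reward + boost
            pulls[i] += 1
    return pulls


if __name__ == "__main__":
    # Two good honest arms and two weak arms that each hold a budget of 20.
    print(run([0.9, 0.8, 0.2, 0.1], [0.0, 0.0, 20.0, 20.0], k=2, horizon=2000))
```

In this toy run the weak arms can only inflate their empirical means temporarily: once the budget is spent, the inflation decays as `budget / pulls`, so the learner returns to the genuinely better arms after a number of extra pulls that grows with the budget rather than with the horizon.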

Code & Models

Repositories

shirleydongj/StrategicCUCB
none · Official

Taxonomy

Topics: Advanced Bandit Algorithms Research · Mobile Crowdsensing and Crowdsourcing · Optimization and Search Problems