Bandits Corrupted by Nature: Lower Bounds on Regret and Robust Optimistic Algorithm

Debabrota Basu; Odalric-Ambrym Maillard; Timothée Mathieu

arXiv:2203.03186·cs.LG·March 22, 2023·1 cites

TL;DR

This paper studies the corrupted bandit problem, in which heavy-tailed rewards are corrupted by a history-independent adversary. It establishes a problem-dependent lower bound on regret and proposes two robust algorithms, HubUCB and SeqHubUCB, with near-optimal performance and improved computational efficiency.

Contribution

The paper introduces a problem-dependent regret lower bound for corrupted bandits and proposes two robust UCB-type algorithms, one of which relies on a sequential estimator for computational efficiency.

Findings

01. HubUCB achieves near-optimal regret bounds.

02. SeqHubUCB reduces computational complexity to linear time.

03. Algorithms perform well across various reward distributions and corruption levels.

Abstract

We study the corrupted bandit problem, i.e. a stochastic multi-armed bandit problem with $k$ unknown reward distributions, which are heavy-tailed and corrupted by a history-independent adversary or Nature. To be specific, the reward obtained by playing an arm comes from the corresponding heavy-tailed reward distribution with probability $1 - \varepsilon \in (0.5, 1]$ and from an arbitrary corruption distribution of unbounded support with probability $\varepsilon \in [0, 0.5)$. First, we provide a problem-dependent lower bound on the regret of any corrupted bandit algorithm. The lower bounds indicate that the corrupted bandit problem is harder than the classical stochastic bandit problem with sub-Gaussian or heavy-tailed rewards. Following that, we propose a novel UCB-type algorithm for corrupted bandits, namely HubUCB, that builds on Huber's estimator for robust mean estimation.…
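The corruption model and the role of Huber's estimator described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's HubUCB: the excerpt gives neither the threshold nor the confidence bonus, so the code only shows (a) sampling a reward that is corrupted with probability ε and (b) a generic Huber location estimate computed by iteratively reweighted averaging. The function names, the Student-t inlier distribution, the constant corruption value, and the threshold `beta` are all assumptions chosen for illustration.

```python
import math
import random


def sample_corrupted_reward(rng, eps, inlier_sampler, corruption_sampler):
    """With probability eps draw from the corruption law, else from the inlier law."""
    if rng.random() < eps:
        return corruption_sampler(rng)
    return inlier_sampler(rng)


def huber_mean(xs, beta, iters=100, tol=1e-10):
    """Huber M-estimate of location with threshold beta > 0.

    Solves sum_i psi(x_i - mu) = 0, where psi clips its argument to
    [-beta, beta], by iteratively reweighted averaging started at the median.
    """
    mu = sorted(xs)[len(xs) // 2]
    for _ in range(iters):
        # Weight psi(r) / r: 1 for small residuals, beta / |r| for large ones.
        weights = [
            1.0 if abs(x - mu) <= beta else beta / abs(x - mu) for x in xs
        ]
        new_mu = sum(w * x for w, x in zip(weights, xs)) / sum(weights)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu


def heavy_tailed_inlier(rng):
    """Hypothetical inlier law: 1 + Student-t with 3 degrees of freedom (mean 1)."""
    chi2 = rng.gammavariate(1.5, 2.0)  # chi-squared with 3 degrees of freedom
    return 1.0 + rng.gauss(0.0, 1.0) / math.sqrt(chi2 / 3.0)


def far_outlier(rng):
    """Hypothetical corruption law: a constant far from the true mean."""
    return 1000.0


if __name__ == "__main__":
    rng = random.Random(0)
    rewards = [
        sample_corrupted_reward(rng, 0.1, heavy_tailed_inlier, far_outlier)
        for _ in range(2000)
    ]
    print("empirical mean:", sum(rewards) / len(rewards))
    print("Huber estimate:", huber_mean(rewards, beta=2.0))
```

In this setup the empirical mean is pulled far from the inlier mean of 1 by the corrupted samples, while the Huber estimate stays close to it, because each sample's influence is capped at `beta`. A residual bias that grows with ε remains, which is in line with the abstract's remark that the corrupted problem is harder than the uncorrupted one.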

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Optimization and Search Problems