Corruption-Robust Linear Bandits: Minimax Optimality and Gap-Dependent Misspecification

Haolin Liu; Artin Tajdini; Andrew Wagenmaker; Chen-Yu Wei

arXiv:2410.07533·cs.LG·October 21, 2024

TL;DR

This paper studies linear bandits with corrupted rewards. It characterizes minimax regret bounds under strong and weak corruption models and introduces optimal algorithms for gap-dependent misspecification, with extensions to reinforcement learning settings.

Contribution

It provides a unified analysis of strong and weak corruption in linear bandits, characterizes the gap between the minimax regret under the two models for stochastic linear bandits, and develops optimal algorithms for gap-dependent misspecification, with extensions to reinforcement learning settings.

Findings

01. Unified framework for strong and weak corruption models.

02. Full characterization of minimax regret gap in stochastic linear bandits.

03. Optimal algorithms for gap-dependent misspecification in linear bandits.

Abstract

In linear bandits, how can a learner effectively learn when facing corrupted rewards? While significant work has explored this question, a holistic understanding across different adversarial models and corruption measures is lacking, as is a full characterization of the minimax regret bounds. In this work, we compare two types of corruptions commonly considered: strong corruption, where the corruption level depends on the action chosen by the learner, and weak corruption, where the corruption level does not depend on the action chosen by the learner. We provide a unified framework to analyze these corruptions. For stochastic linear bandits, we fully characterize the gap between the minimax regret under strong and weak corruptions. We also initiate the study of corrupted adversarial linear bandits, obtaining upper and lower bounds with matching dependencies on the corruption level. Next,…
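One way to picture the strong/weak distinction drawn in the abstract is to compute both corruption levels for the same run. The sketch below is illustrative only: it assumes a finite action set and a hypothetical table `eps[t][a]` giving the corruption added to the reward of action `a` in round `t`. The function name, the table layout, and the exact form of the two sums are assumptions made for this example, and the paper's formal definitions may differ.

```python
def corruption_levels(eps, actions):
    """Return (action_dependent, action_independent) corruption totals.

    eps     -- hypothetical table, eps[t][a] = corruption on action a in round t
    actions -- actions[t] = index of the action the learner chose in round t
    """
    # Action-dependent level: only the corruption on the chosen action counts.
    chosen = sum(abs(eps[t][a]) for t, a in enumerate(actions))
    # Action-independent level: the largest corruption in each round counts,
    # whichever action the learner actually played.
    worst = sum(max(abs(e) for e in row) for row in eps)
    return chosen, worst


# Three rounds, two actions (toy numbers chosen for illustration).
eps = [[0.0, 0.5],
       [0.2, 0.0],
       [0.0, 0.0]]
actions = [0, 0, 1]

strong_level, weak_level = corruption_levels(eps, actions)
print(strong_level, weak_level)
```

Under these assumptions the action-dependent total can never exceed the action-independent one, since each round contributes at most its per-round maximum. A regret bound stated in terms of the action-dependent total is therefore the more demanding guarantee.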

Videos

Corruption-Robust Linear Bandits: Minimax Optimality and Gap-Dependent Misspecification (SlidesLive)

Taxonomy

Topics: Advanced Bandit Algorithms Research · Financial Markets and Investment Strategies · Decision-Making and Behavioral Economics