An Improved Algorithm for Adversarial Linear Contextual Bandits via Reduction
Tim van Erven, Jack Mayo, Julia Olkhovskaya, Chen-Yu Wei

TL;DR
This paper introduces an efficient algorithm for adversarial linear contextual bandits that achieves poly(d)√T regret without prior knowledge of the context distribution, resolving an open question posed by Liu et al. (2023).
Contribution
The authors reduce linear contextual bandits with adversarial losses and stochastic action sets to misspecification-robust adversarial linear bandits with fixed action sets, obtaining poly(d)√T regret in polynomial time, independent of the number of actions.
Findings
Achieves $\tilde{O}(\min\{d^2\sqrt{T}, \sqrt{d^3 T \log K}\})$ regret.
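First polynomial-time algorithm to achieve poly(d)√T regret for combinatorial bandits with adversarial losses and stochastic action sets.
Improves the regret bound to $\tilde{O}(d\sqrt{L^\star})$ when a context simulator is available, where $L^\star$ is the cumulative loss of the best policy.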
Abstract
We present an efficient algorithm for linear contextual bandits with adversarial losses and stochastic action sets. Our approach reduces this setting to misspecification-robust adversarial linear bandits with fixed action sets. Without knowledge of the context distribution or access to a context simulator, the algorithm achieves $\tilde{O}(\min\{d^2\sqrt{T}, \sqrt{d^3 T \log K}\})$ regret and runs in $\mathrm{poly}(d, C, T)$ time, where $d$ is the feature dimension, $C$ is an upper bound on the number of linear constraints defining the action set in each round, $K$ is an upper bound on the number of actions in each round, and $T$ is the number of rounds. This resolves the open question by Liu et al. (2023) on whether one can obtain $\mathrm{poly}(d)\sqrt{T}$ regret in polynomial time independent of the number of actions. For the important class of combinatorial bandits with adversarial losses and…
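To make the setting in the abstract concrete, the following is a minimal Python sketch of the interaction protocol only: action sets drawn i.i.d. from a distribution the learner does not know, loss vectors fixed in advance by an oblivious adversary, and regret measured against the best fixed linear policy in hindsight. It is not the authors' algorithm. The names `run_protocol`, `sample_action_set`, and `uniform_choice`, the uniform sampling distributions, and the placeholder learner are all assumptions introduced for illustration.

```python
import random


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(u, v))


def sample_action_set(rng, d, k):
    """Draw k feature vectors in [-1, 1]^d (hypothetical context distribution)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(d)] for _ in range(k)]


def uniform_choice(action_set, rng):
    """Placeholder learner: picks an action index uniformly, ignoring feedback."""
    return rng.randrange(len(action_set))


def run_protocol(choose, d=3, k=5, horizon=200, seed=0):
    """Simulate the protocol and return the realized regret of `choose`."""
    rng = random.Random(seed)
    # Oblivious adversary: all loss vectors are fixed before the game starts.
    # Scaling by 1/d keeps every loss <a, theta> inside [-1, 1].
    thetas = [[rng.uniform(-1.0, 1.0) / d for _ in range(d)]
              for _ in range(horizon)]
    cum_theta = [sum(th[i] for th in thetas) for i in range(d)]

    learner_loss = 0.0
    comparator_loss = 0.0
    for t in range(horizon):
        # Stochastic action set: a fresh i.i.d. draw each round.
        action_set = sample_action_set(rng, d, k)
        idx = choose(action_set, rng)
        # Bandit feedback: only this scalar loss would be revealed to a learner.
        learner_loss += dot(action_set[idx], thetas[t])
        # Comparator: fixed policy that is greedy w.r.t. the cumulative loss
        # vector, which minimizes expected total loss over fixed policies.
        best = min(action_set, key=lambda a: dot(a, cum_theta))
        comparator_loss += dot(best, thetas[t])
    return learner_loss - comparator_loss


if __name__ == "__main__":
    print("realized regret of the uniform learner:", run_protocol(uniform_choice))
```

Replacing `uniform_choice` with any other function of the same signature lets different learners be compared under the same seed. A learner that actually uses bandit feedback would also need the observed loss passed back to it, which this sketch omits.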
