Constrained Contextual Bandits with Adversarial Contexts
Dhruv Sarkar, Abhishek Sinha

TL;DR
This paper introduces a new framework for budget-constrained contextual bandits with adversarial contexts, improving guarantees and efficiency over prior stochastic-focused methods.
Contribution
It proposes a modular reduction approach leveraging online regression oracles to handle adversarial contexts in budget-constrained bandits.
Findings
Achieves improved guarantees for adversarial contexts.
Provides an efficient algorithm with transparent analysis.
Extends the $\textsf{SquareCB}$ framework to adversarial settings.
Abstract
We study budget-constrained contextual bandits with adversarial contexts, where each action yields a random reward and incurs a random cost. We adopt the standard realizability assumption: conditioned on the observed context, rewards and costs are drawn independently from fixed distributions whose expectations belong to known function classes. We focus on the continuing setting, in which the algorithm operates over the entire horizon even after the budget for cumulative cost is exhausted. In this setting, the objective is to simultaneously control regret and the violation of the budget constraint. Building on the seminal framework of Foster et al. [2018], we propose a simple and modular framework that leverages online regression oracles to reduce the constrained problem to a standard unconstrained contextual bandit problem with adaptively defined surrogate reward…
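The reduction described above ends in a standard unconstrained contextual bandit step. As a rough illustration of that step only, the sketch below implements inverse gap weighting, the action-sampling rule commonly associated with SquareCB. It does not reproduce the paper's surrogate reward construction, which is cut off in the abstract as shown here: the predicted scores are taken as given, and the function name, the `gamma` value, and the example scores are all hypothetical.

```python
import random


def inverse_gap_weighting(scores, gamma):
    """Map predicted per-action scores (higher is better) to sampling probabilities.

    Assumption: `scores` stands in for whatever surrogate reward estimates an
    online regression oracle would return; how those are built is not shown here.
    """
    k = len(scores)
    best = max(range(k), key=lambda a: scores[a])
    # Each non-greedy action gets probability 1 / (K + gamma * gap to the best score).
    probs = [1.0 / (k + gamma * (scores[best] - s)) for s in scores]
    # The greedy action receives all remaining probability mass.
    probs[best] = 0.0
    probs[best] = 1.0 - sum(probs)
    return probs


# Hypothetical predicted scores for three actions under one context.
predicted = [0.2, 0.5, 0.1]
probs = inverse_gap_weighting(predicted, gamma=10.0)

# Sample an action from the resulting distribution (seeded for repeatability).
action = random.Random(0).choices(range(len(predicted)), weights=probs, k=1)[0]
```

A larger `gamma` concentrates probability on the action with the highest predicted score, while a smaller `gamma` spreads it more evenly across actions, which is the usual exploration and exploitation trade-off in this rule.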
