Truly Adapting to Adversarial Constraints in Constrained MABs
Francesco Emanuele Stradi, Kalana Kalupahana, Matteo Castiglioni, Alberto Marchesi, Nicola Gatti

TL;DR
This paper develops algorithms for constrained multi-armed bandit problems that adapt to unknown, possibly adversarial constraints, achieving near-optimal regret and violation bounds under various feedback settings.
Contribution
The paper introduces the first algorithms that achieve optimal rates of regret and positive constraint violation in non-stationary environments covering both stochastic and adversarial constraints.
Findings
Achieves $\tilde{O}(\sqrt{T}+C)$ regret and violation with full feedback.
Extends guarantees to bandit feedback for losses.
Designs algorithms with bounded positive constraint violation under bandit feedback.
Abstract
We study the constrained variant of the \emph{multi-armed bandit} (MAB) problem, in which the learner aims not only at minimizing the total loss incurred during the learning dynamic, but also at controlling the violation of multiple \emph{unknown} constraints, under both \emph{full} and \emph{bandit feedback}. We consider a non-stationary environment that subsumes both stochastic and adversarial models and where, at each round, both losses and constraints are drawn from distributions that may change arbitrarily over time. In such a setting, it is provably not possible to guarantee both sublinear regret and sublinear violation. Accordingly, prior work has mainly focused either on settings with stochastic constraints or on relaxing the benchmark with fully adversarial constraints (\emph{e.g.}, via competitive ratios with respect to the optimum). We provide the first algorithms that…
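The difference between ordinary (signed) cumulative violation and positive violation can be illustrated with a minimal sketch. It assumes the common convention that a constraint is satisfied in a round when its value is at most zero. The function names and the per-round numbers are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def signed_violation(values: Sequence[float]) -> float:
    """Signed cumulative violation: satisfied rounds (negative values)
    can cancel out violated rounds (positive values)."""
    return sum(values)


def positive_violation(values: Sequence[float]) -> float:
    """Positive cumulative violation: only the positive part of each
    round counts, so no cancellation is possible."""
    return sum(max(v, 0.0) for v in values)


# Hypothetical per-round constraint values (assumption: value <= 0 means
# the constraint is satisfied in that round).
values = [0.5, -0.5, 0.25, -0.25]

signed = signed_violation(values)      # violated and satisfied rounds cancel
positive = positive_violation(values)  # only the violated rounds accumulate
print(f"signed={signed}, positive={positive}")
```

Under this convention, a learner that alternates between violating and over-satisfying a constraint can have zero signed violation while its positive violation keeps growing, which is why a bound on positive violation is the stronger guarantee.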
Taxonomy
Topics: Advanced Bandit Algorithms Research · Adversarial Robustness in Machine Learning · Machine Learning and Algorithms
