Data-Dependent Regret Bounds for Constrained MABs
Gianmarco Genalti, Francesco Emanuele Stradi, Matteo Castiglioni, Alberto Marchesi, Nicola Gatti

TL;DR
This paper develops data-dependent regret bounds for constrained multi-armed bandits, showing they can be significantly tighter than classical bounds and are fundamental to understanding the problem's complexity.
Contribution
The paper introduces the first data-dependent regret bounds for constrained MABs with adversarial losses and stochastic constraints, along with a new algorithm and lower bounds.
Findings
Regret bounds with two data-dependent terms capturing constraint difficulty and learning complexity
Lower bounds confirming the fundamental nature of these two terms
Novel results for soft-constraint settings that may be of independent interest
Abstract
This paper initiates the study of data-dependent regret bounds in constrained MAB settings. These bounds depend on the sequence of losses that characterize the problem instance. Thus, they can be much smaller than classical regret bounds, while being equivalent to them in the worst case. Despite this, data-dependent regret bounds have been completely overlooked in constrained MAB settings. The goal of this paper is to answer the following question: Can data-dependent regret bounds be derived in the presence of constraints? We answer this question affirmatively in constrained MABs with adversarial losses and stochastic constraints. Specifically, our main focus is on the most challenging and natural settings with hard constraints, where the learner must ensure that the constraints are always satisfied with high probability. We design an algorithm with a…
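To make the gap between a data-dependent and a classical bound concrete, the sketch below compares two rates on a toy instance. It is an illustration only: it uses the well-known first-order ("small-loss") form from unconstrained bandits, sqrt(K * L_star), as a stand-in, with constants and logarithmic factors dropped. It does not reproduce this paper's bounds, which include additional constraint-related terms not shown in the excerpt above. The function names and the loss sequences are hypothetical.

```python
import math


def classical_bound(num_arms: int, horizon: int) -> float:
    """Worst-case rate sqrt(K * T); constants and log factors dropped."""
    return math.sqrt(num_arms * horizon)


def small_loss_bound(num_arms: int, losses: list[list[float]]) -> float:
    """Data-dependent rate sqrt(K * L_star).

    L_star is the cumulative loss of the best fixed arm in hindsight.
    This is the generic unconstrained small-loss form, assumed here
    purely for illustration; it is not the bound from the paper.
    """
    cumulative = [
        sum(round_losses[arm] for round_losses in losses)
        for arm in range(num_arms)
    ]
    l_star = min(cumulative)
    return math.sqrt(num_arms * l_star)


# Hypothetical instance: 3 arms, 10,000 rounds, losses in [0, 1].
K, T = 3, 10_000
easy = [[0.01, 0.5, 0.9] for _ in range(T)]  # best arm incurs very little loss
hard = [[1.0, 1.0, 1.0] for _ in range(T)]   # every arm incurs maximal loss

print("classical:        ", classical_bound(K, T))
print("small-loss (easy):", small_loss_bound(K, easy))
print("small-loss (hard):", small_loss_bound(K, hard))
```

On the easy sequence the data-dependent rate is roughly ten times smaller than the classical one, while on the hard sequence the two coincide, which mirrors the abstract's claim that such bounds can be much smaller yet match classical bounds in the worst case.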
Taxonomy
Topics: Fault Detection and Control Systems · Neural Networks and Applications
Methods: Focus
