Beyond Primal-Dual Methods in Bandits with Stochastic and Adversarial Constraints

Martino Bernasconi; Matteo Castiglioni; Andrea Celli; Federico Fusco

arXiv:2405.16118·cs.LG·May 28, 2024

Open Access

TL;DR

This paper introduces a novel optimistic approach for constrained bandit problems that performs optimally in both stochastic and adversarial settings without requiring Slater's condition, simplifying previous methods.

Contribution

It proposes a new algorithm based on optimistic constraint estimation that achieves best-of-both-worlds performance with fewer assumptions and simpler analysis.

Findings

01

Achieves bounds that scale logarithmically in the number of constraints.

02

Provides $\widetilde{O}(\sqrt{T})$ regret in stochastic settings without Slater's condition.

03

Simplifies previous primal-dual methods with a cleaner, more natural approach.

Abstract

We address a generalization of the bandit with knapsacks problem, where a learner aims to maximize rewards while satisfying an arbitrary set of long-term constraints. Our goal is to design best-of-both-worlds algorithms that perform optimally under both stochastic and adversarial constraints. Previous works address this problem via primal-dual methods and require some stringent assumptions, namely Slater's condition. In adversarial settings, they either assume knowledge of a lower bound on Slater's parameter, or impose strong requirements on the primal and dual regret minimizers, such as requiring weak adaptivity. We propose an alternative and more natural approach based on optimistic estimations of the constraints. Surprisingly, we show that estimating the constraints with a UCB-like approach guarantees optimal performance. Our algorithm consists of two main components:…
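
The idea of estimating constraints optimistically, in a UCB-like way, can be illustrated with a toy example. The sketch below is not the paper's algorithm: it assumes a simple K-armed bandit with a single constraint of the form "expected cost of the arm is at most 0", a Hoeffding-style confidence radius, and a greedy choice among the arms that still look feasible. The class name, method names, and the `simulate` helper are all hypothetical and exist only for illustration.

```python
import math
import random


class OptimisticConstraintBandit:
    """Toy K-armed bandit with one constraint E[cost(arm)] <= 0.

    Illustrative assumption only; NOT the algorithm from the paper.
    """

    def __init__(self, n_arms, horizon):
        self.n_arms = n_arms
        self.horizon = horizon
        self.counts = [0] * n_arms
        self.reward_sum = [0.0] * n_arms
        self.cost_sum = [0.0] * n_arms

    def _radius(self, arm):
        # Hoeffding-style confidence radius; infinite before the first pull.
        n = self.counts[arm]
        if n == 0:
            return math.inf
        return math.sqrt(2.0 * math.log(self.horizon) / n)

    def optimistic_cost(self, arm):
        # Lower confidence bound on the cost: optimism makes an arm look
        # feasible until the data says otherwise.
        n = self.counts[arm]
        mean = self.cost_sum[arm] / n if n else 0.0
        return mean - self._radius(arm)

    def optimistic_reward(self, arm):
        # Upper confidence bound on the reward, as in standard UCB.
        n = self.counts[arm]
        mean = self.reward_sum[arm] / n if n else 0.0
        return mean + self._radius(arm)

    def feasible_arms(self):
        # Optimistic estimate of the feasible set.
        return [a for a in range(self.n_arms) if self.optimistic_cost(a) <= 0.0]

    def select_arm(self):
        candidates = self.feasible_arms()
        if not candidates:
            # Fallback (an assumption of this sketch): least-violating arm.
            return min(range(self.n_arms), key=self.optimistic_cost)
        return max(candidates, key=self.optimistic_reward)

    def update(self, arm, reward, cost):
        self.counts[arm] += 1
        self.reward_sum[arm] += reward
        self.cost_sum[arm] += cost


def simulate(reward_means, cost_means, horizon, seed=0):
    """Run the toy bandit on synthetic stochastic feedback; return pull counts."""
    rng = random.Random(seed)
    bandit = OptimisticConstraintBandit(len(reward_means), horizon)
    for _ in range(horizon):
        arm = bandit.select_arm()
        reward = reward_means[arm] + rng.uniform(-0.1, 0.1)
        cost = cost_means[arm] + rng.uniform(-0.1, 0.1)
        bandit.update(arm, reward, cost)
    return bandit.counts
```

For instance, `simulate([0.9, 0.5], [0.5, -0.5], horizon=2000)` pits a high-reward arm that violates the constraint against a lower-reward arm that satisfies it. The violating arm is explored only until its optimistic cost estimate rises above zero, after which the feasible arm takes over. The setting in the abstract is richer than this: it involves long-term constraints, adversarial as well as stochastic feedback, and a regret minimizer, none of which this sketch models.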

Taxonomy

Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Auction Theory and Applications