Beyond $\tilde{O}(\sqrt{T})$ Constraint Violation for Online Convex Optimization with Adversarial Constraints
Abhishek Sinha, Rahul Vaze

TL;DR
This paper introduces a new online policy for convex optimization with adversarial constraints that trades regret for cumulative constraint violation (CCV), achieving smaller CCV than previous methods. This matters most in safety-critical applications.
Contribution
It proposes a novel trade-off approach that reduces constraint violation at the expense of some regret, with efficient algorithms for special and general cases.
Findings
Achieves $\tilde{O}(dT^{1-\beta})$ CCV with tunable $\beta$
Develops an efficient policy for the constrained expert problem
Extends results to smooth functions with improved bounds
Abstract
We study Online Convex Optimization with adversarial constraints (COCO). At each round a learner selects an action from a convex decision set and then an adversary reveals a convex cost and a convex constraint function. The goal of the learner is to select a sequence of actions to minimize both regret and the cumulative constraint violation (CCV) over a horizon of length $T$. The best-known policy for this problem achieves $O(\sqrt{T})$ regret and $\tilde{O}(\sqrt{T})$ CCV. In this paper, we improve this by trading off regret to achieve substantially smaller CCV. This trade-off is especially important in safety-critical applications, where satisfying the safety constraints is non-negotiable. Specifically, for any bounded convex cost and constraint functions, we propose an online policy that achieves $\tilde{O}(\sqrt{dT} + T^{\beta})$ regret and $\tilde{O}(dT^{1-\beta})$ CCV, where $d$ is…
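The protocol and the two metrics described in the abstract can be made concrete with a small simulation. The sketch below is not the paper's policy: it uses a plain projected-gradient learner with a fixed penalty on violated constraints, chosen only to show the order of play and how regret and CCV are measured. The unit-ball decision set, the linear costs and constraints, the function names, and the definition of CCV as the sum of positive parts of the constraint values are all assumptions made for illustration.

```python
import numpy as np

def project_ball(x, radius=1.0):
    """Euclidean projection onto a ball (the assumed convex decision set)."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)

def run_protocol(costs, constraints, dim, step=0.1, penalty=1.0):
    """Play the rounds with a simple penalized gradient learner (illustrative only)."""
    x = np.zeros(dim)
    actions = []
    for (f, grad_f), (g, grad_g) in zip(costs, constraints):
        actions.append(x.copy())      # the learner commits before f_t, g_t are revealed
        grad = grad_f(x)
        if g(x) > 0:                  # push back only when the constraint was violated
            grad = grad + penalty * grad_g(x)
        x = project_ball(x - step * grad)
    return actions

def regret(actions, costs, comparator):
    """Learner's total cost minus the total cost of a fixed comparator action."""
    learner = sum(f(x) for x, (f, _) in zip(actions, costs))
    return learner - sum(f(comparator) for f, _ in costs)

def ccv(actions, constraints):
    """Cumulative constraint violation: sum of the positive parts of g_t(x_t)."""
    return sum(max(g(x), 0.0) for x, (g, _) in zip(actions, constraints))

# Toy instance: random linear costs, constraints of the form x[0] <= b_t.
rng = np.random.default_rng(0)
T, dim = 200, 2
costs, constraints = [], []
for _ in range(T):
    c = rng.uniform(-1.0, 1.0, size=dim)
    b = rng.uniform(0.2, 0.5)
    costs.append((lambda x, c=c: float(c @ x), lambda x, c=c: c))
    constraints.append((lambda x, b=b: float(x[0] - b),
                        lambda x: np.array([1.0, 0.0])))

actions = run_protocol(costs, constraints, dim)
comparator = np.zeros(dim)            # feasible in every round, since b_t > 0
print("regret vs. comparator:", regret(actions, costs, comparator))
print("CCV:", ccv(actions, constraints))
```

The comparator here is an arbitrary feasible point rather than the best fixed feasible action in hindsight, so the printed regret is only a lower bound on regret against the best such action.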
Taxonomy
Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques · Game Theory and Applications
Methods: Sparse Evolutionary Training
