Safe Online Convex Optimization with Multi-Point Feedback

Spencer Hutchinson; Mahnoosh Alizadeh

arXiv:2407.11471·cs.LG·July 17, 2024

Open Access

TL;DR

This paper introduces a safe online convex optimization algorithm that uses multi-point zero-order feedback to achieve sublinear regret and zero constraint violation, suitable for safety-critical applications.

Contribution

It proposes a novel algorithm combining forward-difference gradient estimation with optimistic and pessimistic action sets for safety and efficiency.
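One standard way to read "optimistic and pessimistic action sets" is as outer and inner approximations of the feasible set, built from the constraint value and gradient at a single point. The sketch below rests on assumptions not stated in this summary: the feasible set is $\{y : g(y) \le 0\}$, the constraint function $g$ is $M$-smooth and $\mu$-strongly convex, and exact values of $g(x)$ and $\nabla g(x)$ are available. The symbols $g$, $M$, and $\mu$ are illustrative, and the paper's own construction, which has to account for gradient-estimation error, may differ.

$$
g(x) + \nabla g(x)^\top (y - x) + \frac{\mu}{2}\,\|y - x\|^2 \;\le\; g(y) \;\le\; g(x) + \nabla g(x)^\top (y - x) + \frac{M}{2}\,\|y - x\|^2
$$

Under these assumptions, any $y$ for which the right-hand side is at most $0$ is guaranteed to be feasible (a pessimistic set), while every feasible $y$ makes the left-hand side at most $0$ (an optimistic set).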

Findings

01

Achieves $\tilde{O}(d \sqrt{T})$ regret with zero constraint violation.

02

Handles an unknown constraint function using only zero-order (function-value) feedback.

03

Evaluates the algorithm's empirical performance in a numerical study.

Abstract

Motivated by the stringent safety requirements that are often present in real-world applications, we study a safe online convex optimization setting where the player needs to simultaneously achieve sublinear regret and zero constraint violation while only using zero-order information. In particular, we consider a multi-point feedback setting, where the player chooses $d + 1$ points in each round (where $d$ is the problem dimension) and then receives the value of the constraint function and cost function at each of these points. To address this problem, we propose an algorithm that leverages forward-difference gradient estimation as well as optimistic and pessimistic action sets to achieve $O(d \sqrt{T})$ regret and zero constraint violation under the assumption that the constraint function is smooth and strongly convex. We then perform a numerical study to investigate the…
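To make the $d + 1$-point feedback concrete, here is a minimal sketch of a generic forward-difference gradient estimator: one query at the base point plus one query per coordinate direction. The function names, the step size `delta`, and the example cost are hypothetical choices for illustration; this is not the paper's algorithm, which also has to keep every queried point safe.

```python
import numpy as np

def forward_difference_gradient(f, x, delta=1e-5):
    """Estimate the gradient of f at x from d + 1 function evaluations.

    `delta` is an illustrative step size, not a value taken from the paper.
    Returns the function value at x and the gradient estimate.
    """
    x = np.asarray(x, dtype=float)
    d = x.size
    f_x = f(x)  # one query at the base point
    grad = np.empty(d)
    for i in range(d):
        e_i = np.zeros(d)
        e_i[i] = 1.0
        # one extra query per coordinate direction
        grad[i] = (f(x + delta * e_i) - f_x) / delta
    return f_x, grad

def quadratic_cost(x):
    """Hypothetical smooth cost with known gradient x + e_1."""
    return 0.5 * float(np.dot(x, x)) + float(x[0])

value, grad_estimate = forward_difference_gradient(quadratic_cost, [1.0, -2.0, 0.5])
```

For this quadratic the exact gradient at the chosen point is `[2.0, -2.0, 0.5]`, and the forward-difference estimate differs from it by roughly `delta / 2` per coordinate, so a smaller `delta` reduces bias until floating-point cancellation starts to dominate.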

Taxonomy

Topics: Advanced Wireless Network Optimization · Advanced Bandit Algorithms Research · Smart Parking Systems Research