Bandit Convex Optimisation Revisited: FTRL Achieves $\tilde{O}(t^{1/2})$ Regret

David Young; Douglas Leith; George Iosifidis

arXiv:2302.00358·cs.LG·June 27, 2023

TL;DR

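This paper shows that a kernel estimator using multiple function evaluations can be converted into a sampling-based bandit estimator whose expectation equals the original kernel estimate. Plugged into the standard FTRL algorithm, this estimator achieves $\tilde{O}(t^{1/2})$ regret against adversarial time-varying convex losses.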

Contribution

It gives a simple conversion from multi-evaluation kernel estimators to sampling-based bandit estimators and uses it to obtain $\tilde{O}(t^{1/2})$ regret for FTRL in bandit convex optimisation.

Findings

01

Achieves $\tilde{O}(t^{1/2})$ regret with FTRL in bandit convex optimisation.

02

Provides a simple conversion of kernel estimators into bandit estimators.

03

Enhances the theoretical understanding of bandit algorithms with kernel methods.

Abstract

We show that a kernel estimator using multiple function evaluations can be easily converted into a sampling-based bandit estimator with expectation equal to the original kernel estimate. Plugging such a bandit estimator into the standard FTRL algorithm yields a bandit convex optimisation algorithm that achieves $\tilde{O} (t^{1/2})$ regret against adversarial time-varying convex loss functions.
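
The conversion described in the abstract can be illustrated with a generic importance-sampling sketch. This is not the paper's construction: it assumes the kernel estimate is a finite weighted sum of function evaluations, and the function names, evaluation points, weights, and sampling probabilities below are all hypothetical. The point it shows is only that evaluating the function at one randomly sampled point, reweighted by the inverse of its sampling probability, has expectation equal to the full multi-evaluation estimate.

```python
import random


def kernel_estimate(f, points, weights):
    """Multi-evaluation estimate: a weighted sum of f over all points."""
    return sum(w * f(x) for x, w in zip(points, weights))


def bandit_estimate(f, points, weights, probs, rng):
    """Single-evaluation estimate (assumed importance-weighting scheme).

    Samples one index i with probability probs[i], evaluates f once,
    and reweights by 1 / probs[i].
    """
    i = rng.choices(range(len(points)), weights=probs, k=1)[0]
    return weights[i] * f(points[i]) / probs[i]


def expected_bandit_estimate(f, points, weights, probs):
    """Exact expectation of bandit_estimate, by enumerating every outcome."""
    return sum(p * (w * f(x) / p) for x, w, p in zip(points, weights, probs))


# Hypothetical example data: a convex loss and an arbitrary signed kernel.
def loss(x):
    return (x - 0.3) ** 2


points = [-1.0, 0.0, 0.5, 1.0]
weights = [0.5, -1.0, 2.0, 0.25]
total = sum(abs(w) for w in weights)
probs = [abs(w) / total for w in weights]  # sample proportionally to |weight|

full = kernel_estimate(loss, points, weights)
exact_mean = expected_bandit_estimate(loss, points, weights, probs)
print(full, exact_mean)
```

Sampling proportionally to the absolute weight is one possible design choice for keeping the reweighted values bounded; any strictly positive probabilities give the same expectation.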

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Sparse and Compressive Sensing Techniques