A Second-Order Method for Stochastic Bandit Convex Optimisation

Tor Lattimore; András György

arXiv:2302.05371·cs.LG·February 13, 2023

TL;DR

This paper presents a second-order method for unconstrained stochastic bandit convex optimisation and proves that its regret is at most $(1 + r/d)[d^{1.5} \sqrt{n} + d^3]$ up to polylogarithmic factors, where the minimiser is known only to lie in a ball of radius $r$.

Contribution

The paper introduces a simple, efficient second-order algorithm for unconstrained zeroth-order stochastic convex bandits, together with a proven upper bound on its regret.
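To make the setting concrete, here is a minimal sketch of the zeroth-order stochastic bandit protocol: in each round the learner picks a point, observes only a noisy loss value there, and accumulates regret against the minimiser. This illustrates the feedback model only, not the paper's algorithm. The names `make_quadratic_loss`, `play` and `origin_policy`, the quadratic loss and the Gaussian noise are all assumptions made for illustration.

```python
import random


def make_quadratic_loss(minimiser):
    """Hypothetical convex loss f(x) = ||x - minimiser||^2 (illustration only)."""
    def loss(x):
        return sum((xi - mi) ** 2 for xi, mi in zip(x, minimiser))
    return loss


def play(policy, loss, minimiser, horizon, noise_std=0.1, seed=0):
    """Run the bandit protocol and return the cumulative regret.

    The policy sees only past (point, noisy loss) pairs: no gradients,
    no noise-free loss values. Gaussian noise is an assumption here.
    """
    rng = random.Random(seed)
    history = []
    regret = 0.0
    optimum = loss(minimiser)
    for _ in range(horizon):
        x = policy(history)
        # Zeroth-order feedback: a single noisy evaluation at the played point.
        observation = loss(x) + rng.gauss(0.0, noise_std)
        history.append((x, observation))
        # Regret is measured with the true (noise-free) loss.
        regret += loss(x) - optimum
    return regret


def origin_policy(history):
    """Placeholder policy that ignores feedback and always plays the origin."""
    return (0.0, 0.0)


if __name__ == "__main__":
    minimiser = (1.0, 0.0)
    loss = make_quadratic_loss(minimiser)
    print(play(origin_policy, loss, minimiser, horizon=10))
```

A policy that never learns, like `origin_policy`, pays the same loss gap every round, so its regret grows linearly in the horizon. A bound of order $\sqrt{n}$ says the learner does substantially better than that.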

Findings

01

Regret is proven to be at most $(1 + r/d)[d^{1.5} \sqrt{n} + d^3] \operatorname{polylog}(n, d, r)$

02

The algorithm applies to the unconstrained setting and needs only a known ball of radius $r$ containing the minimiser

03

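Regret grows as $\sqrt{n}$ in the horizon $n$ and polynomially in the dimension $d$; the $d^{1.5} \sqrt{n}$ term dominates the $d^3$ term once $n \geq d^3$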

Abstract

We introduce a simple and efficient algorithm for unconstrained zeroth-order stochastic convex bandits and prove its regret is at most $(1 + r/d)[d^{1.5} \sqrt{n} + d^{3}] \operatorname{polylog}(n, d, r)$ where $n$ is the horizon, $d$ the dimension and $r$ is the radius of a known ball containing the minimiser of the loss.
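As a rough numerical aid, the sketch below evaluates the leading terms of the bound, taken in the form $(1 + r/d)[d^{1.5} \sqrt{n} + d^3]$. The polylogarithmic factor and any constants are omitted because they are not specified here, so the values show scaling only. The function names are hypothetical.

```python
import math


def regret_bound_leading_terms(n, d, r):
    """Evaluate (1 + r/d) * (d^1.5 * sqrt(n) + d^3).

    Assumption: the polylog(n, d, r) factor and constants are dropped,
    so this is a scaling illustration, not the actual bound.
    """
    return (1.0 + r / d) * (d ** 1.5 * math.sqrt(n) + d ** 3)


def crossover_horizon(d):
    """Horizon n at which d^1.5 * sqrt(n) equals d^3, namely n = d^3."""
    return d ** 3


if __name__ == "__main__":
    for n in (10**2, 10**4, 10**6):
        print(n, regret_bound_leading_terms(n, d=4, r=4))
    print("crossover for d=4:", crossover_horizon(4))
```

For horizons beyond the crossover $n = d^3$, the $d^{1.5} \sqrt{n}$ term dominates, and multiplying the horizon by 100 multiplies that term by only 10.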

Taxonomy

Topics: Advanced Bandit Algorithms Research · Risk and Portfolio Optimization · Sparse and Compressive Sensing Techniques