A Second-Order Method for Stochastic Bandit Convex Optimisation
Tor Lattimore, András György

TL;DR
This paper presents a second-order method for unconstrained stochastic bandit convex optimization whose regret grows as $\sqrt{n}$ in the horizon $n$ and polynomially in the dimension $d$, when the minimizer is known to lie in a ball of radius $r$.
Contribution
The paper introduces a simple, efficient second-order algorithm for unconstrained zeroth-order stochastic convex bandits, together with a proven upper bound on its regret.
Findings
Regret bound of $(1 + r/d)[d^{1.5} \sqrt{n} + d^3]$ proven for the algorithm
Algorithm is simple and efficient, and applies to unconstrained zeroth-order stochastic convex bandit problems
Regret grows as $\sqrt{n}$ in the horizon $n$; the $d^{1.5} \sqrt{n}$ term dominates the $d^3$ term once $n \ge d^3$
Abstract
We introduce a simple and efficient algorithm for unconstrained zeroth-order stochastic convex bandits and prove its regret is at most $(1 + r/d)[d^{1.5} \sqrt{n} + d^3]$, where $n$ is the horizon, $d$ the dimension and $r$ is the radius of a known ball containing the minimiser of the loss.
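To get a feel for how the stated bound $(1 + r/d)[d^{1.5} \sqrt{n} + d^3]$ behaves, the expression can be evaluated numerically for a horizon $n$, dimension $d$ and ball radius $r$. The sketch below is only an illustration of the formula as written on this page: it assumes all constants are 1 and ignores any logarithmic factors, and the function names `regret_bound` and `average_regret_bound` are hypothetical, not taken from the paper.

```python
import math


def regret_bound(n: int, d: int, r: float) -> float:
    """Evaluate (1 + r/d) * (d**1.5 * sqrt(n) + d**3).

    Assumption: constants are taken to be 1 and logarithmic factors are
    omitted, so the value only indicates how the bound scales.
    """
    if n <= 0 or d <= 0 or r < 0:
        raise ValueError("need n > 0, d > 0 and r >= 0")
    return (1.0 + r / d) * (d ** 1.5 * math.sqrt(n) + d ** 3)


def average_regret_bound(n: int, d: int, r: float) -> float:
    """Bound on the per-round regret, i.e. the bound divided by the horizon."""
    return regret_bound(n, d, r) / n


if __name__ == "__main__":
    # Per-round regret shrinks as the horizon grows, for fixed d and r.
    for n in (10**4, 10**6, 10**8):
        print(n, average_regret_bound(n, d=4, r=4.0))
```

Because the bound grows like $\sqrt{n}$, dividing by $n$ gives a per-round quantity that decays like $1/\sqrt{n}$ once the $d^{1.5} \sqrt{n}$ term dominates.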
Taxonomy
Topics: Advanced Bandit Algorithms Research · Risk and Portfolio Optimization · Sparse and Compressive Sensing Techniques
