Kernel-based methods for bandit convex optimization

Sébastien Bubeck; Ronen Eldan; Yin Tat Lee

arXiv:1607.03084·cs.LG·July 19, 2016

TL;DR

This paper introduces a kernel-based algorithm for the adversarial convex bandit problem that achieves $\tilde{O}(n^{9.5} \sqrt{T})$ regret in $\mathrm{poly}(T)$ time, improving on earlier regret and running-time bounds in derivative-free optimization.

Contribution

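The paper presents the first $\mathrm{poly}(T)$-time algorithm with $\mathrm{poly}(n) \sqrt{T}$ regret for the adversarial convex bandit problem, using kernel methods, a generalization of Bernoulli convolutions, and a new annealing schedule for exponential weights with an increasing learning rate.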

Findings

01

Achieves $\tilde{O}(n^{9.5} \sqrt{T})$ regret in $\mathrm{poly}(T)$ time

02

A variant runs in $\mathrm{poly}(n \log(T))$ time per step at the cost of an additional $\mathrm{poly}(n) T^{o(1)}$ factor in the regret

03

Improves upon the earlier $\tilde{O}(n^{11} \sqrt{T})$-regret, $\exp(\mathrm{poly}(T))$-time result of the first two authors

Abstract

We consider the adversarial convex bandit problem and we build the first $\mathrm{poly}(T)$-time algorithm with $\mathrm{poly}(n) \sqrt{T}$-regret for this problem. To do so we introduce three new ideas in the derivative-free optimization literature: (i) kernel methods, (ii) a generalization of Bernoulli convolutions, and (iii) a new annealing schedule for exponential weights (with increasing learning rate). The basic version of our algorithm achieves $\tilde{O}(n^{9.5} \sqrt{T})$-regret, and we show that a simple variant of this algorithm can be run in $\mathrm{poly}(n \log(T))$-time per step at the cost of an additional $\mathrm{poly}(n) T^{o(1)}$ factor in the regret. These results improve upon the $\tilde{O}(n^{11} \sqrt{T})$-regret and $\exp(\mathrm{poly}(T))$-time result of the first two authors, and the $\log(T)^{\mathrm{poly}(n)} \sqrt{T}$-regret and…
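To make the setting in the abstract concrete, the following is a minimal sketch of bandit feedback and regret, assuming a one-dimensional domain discretized into a finite grid. It runs the classical Exp3-style exponential weights update with an importance-weighted loss estimate and a fixed learning rate. It is not the paper's algorithm: the kernel-based estimator, the generalized Bernoulli convolutions, and the increasing learning-rate schedule are not implemented. The function name `exp3_on_grid`, the quadratic losses, and the grid size are illustrative assumptions.

```python
import math
import random


def exp3_on_grid(losses, grid, eta, seed=0):
    """Exp3-style exponential weights over a finite grid (illustrative baseline).

    losses: list of functions, one per round, each mapping a point to [0, 1].
    grid:   list of candidate points (an assumed discretization of the domain).
    eta:    learning rate, held fixed here (the paper instead anneals it).
    Returns (plays, regret), with regret measured against the best grid point.
    """
    rng = random.Random(seed)
    k = len(grid)
    cum_est = [0.0] * k  # cumulative estimated loss of each grid point
    plays = []
    incurred = 0.0
    for loss in losses:
        # Exponential weights distribution, shifted by the minimum for stability.
        m = min(cum_est)
        w = [math.exp(-eta * (c - m)) for c in cum_est]
        total = sum(w)
        p = [wi / total for wi in w]
        # Sample one action and observe only its loss (bandit feedback).
        i = rng.choices(range(k), weights=p)[0]
        observed = loss(grid[i])
        incurred += observed
        plays.append(grid[i])
        # Importance-weighted estimate: unbiased for the unobserved loss vector.
        cum_est[i] += observed / p[i]
    best_fixed = min(sum(loss(x) for loss in losses) for x in grid)
    return plays, incurred - best_fixed


T = 500
grid = [-1 + 2 * j / 20 for j in range(21)]
# Hypothetical convex losses: scaled quadratics whose minimizer alternates.
losses = [(lambda x, c=(0.3 if t % 2 else 0.5): (x - c) ** 2 / 4) for t in range(T)]
eta = math.sqrt(math.log(len(grid)) / (T * len(grid)))
plays, regret = exp3_on_grid(losses, grid, eta)
print(f"regret after {T} rounds: {regret:.2f}")
```

In this baseline the regret grows with the number of grid points, and a grid fine enough to cover an $n$-dimensional domain has size exponential in $n$. The abstract's $\mathrm{poly}(n) \sqrt{T}$ bound avoids that exponential dependence on the dimension.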
