Multi-scale exploration of convex functions and bandit convex optimization

Sébastien Bubeck; Ronen Eldan

arXiv:1507.06580·math.MG·July 24, 2015·22 cites

Open Access

TL;DR

This paper introduces a novel multi-scale exploration method for convex functions, enabling the resolution of a long-standing open problem in adversarial bandit convex optimization by achieving near-optimal regret bounds.

Contribution

It constructs a new mapping from convex functions to distributions that facilitate multi-scale exploration, solving a decade-old open problem in bandit convex optimization.

Findings

01

Achieved $\tilde{O}(\mathrm{poly}(n) \sqrt{T})$ minimax regret bound.

02

Developed a new map for multi-scale exploration of convex functions.

03

Connected Bayesian and adversarial regret analyses through this exploration.

Abstract

We construct a new map from a convex function to a distribution on its domain, with the property that this distribution is a multi-scale exploration of the function. We use this map to solve a decade-old open problem in adversarial bandit convex optimization by showing that the minimax regret for this problem is $\tilde{O}(\mathrm{poly}(n) \sqrt{T})$, where $n$ is the dimension and $T$ the number of rounds. This bound is obtained by studying the dual Bayesian maximin regret via the information ratio analysis of Russo and Van Roy, and then using the multi-scale exploration to solve the Bayesian problem.
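
For readers unfamiliar with the setting, the regret referred to in the abstract can be written out as follows. This is a sketch in standard bandit convex optimization notation; the symbols $K$ (the convex domain), $\ell_t$ (the convex loss chosen by the adversary in round $t$), and $x_t$ (the point played by the learner) are assumptions for illustration and do not appear on this page.

$$
R_T \;=\; \sum_{t=1}^{T} \ell_t(x_t) \;-\; \min_{x \in K} \sum_{t=1}^{T} \ell_t(x)
$$

In the bandit setting the learner observes only the value $\ell_t(x_t)$ after each round, not the function $\ell_t$ itself. The minimax regret is the infimum over learner strategies of the supremum over loss sequences of $\mathbb{E}[R_T]$. The Bayesian maximin regret mentioned in the abstract swaps the order: the supremum is taken over prior distributions on loss sequences, and the learner, who knows the prior, minimizes the expected regret under it.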

Taxonomy

Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Optimization and Search Problems