Multi-scale exploration of convex functions and bandit convex optimization
Sébastien Bubeck, Ronen Eldan

TL;DR
This paper introduces a novel multi-scale exploration method for convex functions, enabling the resolution of a long-standing open problem in adversarial bandit convex optimization by achieving near-optimal regret bounds.
Contribution
It constructs a new mapping from convex functions to distributions that facilitate multi-scale exploration, solving a decade-old open problem in bandit convex optimization.
Findings
Achieved $\tilde{O}(\text{poly}(n) \, \sqrt{T})$ minimax regret bound.
Developed a new map for multi-scale exploration of convex functions.
Connected Bayesian and adversarial regret analyses through this exploration.
Abstract
We construct a new map from a convex function to a distribution on its domain, with the property that this distribution is a multi-scale exploration of the function. We use this map to solve a decade-old open problem in adversarial bandit convex optimization by showing that the minimax regret for this problem is $\tilde{O}(\text{poly}(n) \, \sqrt{T})$, where $n$ is the dimension and $T$ the number of rounds. This bound is obtained by studying the dual Bayesian maximin regret via the information ratio analysis of Russo and Van Roy, and then using the multi-scale exploration to solve the Bayesian problem.
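For orientation, the information ratio argument mentioned in the abstract can be sketched as below. The notation ($K$, $f_t$, $x_t$, $x^*$, $\Gamma_t$, $\bar{\Gamma}$) is assumed here for illustration and does not come from this page. The final bound is stated under the assumption that $x^*$ takes values in a finite set, so that its entropy $H(x^*)$ is finite.

```latex
% Regret of a player choosing x_t \in K against convex losses f_1, ..., f_T
R_T = \sum_{t=1}^{T} f_t(x_t) - \min_{x \in K} \sum_{t=1}^{T} f_t(x)

% Information ratio at round t (Bayesian setting, x^* the random minimizer);
% E_t and I_t are conditional on the history before round t
\Gamma_t = \frac{\left( \mathbb{E}_t\left[ f_t(x_t) - f_t(x^*) \right] \right)^2}
                {I_t\left( x^* ; (x_t, f_t(x_t)) \right)}

% If \Gamma_t \le \bar{\Gamma} for every round t, then
\mathbb{E}[R_T] \le \sqrt{\bar{\Gamma} \, H(x^*) \, T}
```

Read this way, a $\sqrt{T}$ Bayesian regret bound follows once the information ratio is controlled by a quantity polynomial in $n$, which is roughly the role the abstract assigns to the multi-scale exploration distribution.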
Taxonomy
Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Optimization and Search Problems
