Adaptive-treed bandits

Adam D. Bull

arXiv:1302.2489 · math.ST · September 30, 2015

TL;DR

This paper introduces an algorithm for noisy global optimization and continuum-armed bandits that, without prior information, achieves square-root regret over continuous reward functions with finitely many polynomial maxima. It works by reducing these problems to tree-armed bandits and adaptively combining multiple trees.

Contribution

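The paper presents a novel algorithm for noisy global optimization and continuum-armed bandits that attains square-root regret without prior information. It also gives new results for tree-armed bandits: an adaptive way to combine multiple trees, and near-matching lower bounds in terms of the zooming dimension.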

Findings

01

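Achieves square-root regret in bandits and inverse-square-root error in optimization, over continuous reward functions with finitely many polynomial maxima and without prior information.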

02

Shows that multiple trees can be combined adaptively so as to minimize the regret.

03

Provides near-matching lower bounds on the regret in terms of the zooming dimension.
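
For orientation, the two rates named in the findings can be written in standard bandit notation. The symbols below (f for the reward function, x_t for the arm chosen at round t, \hat{x}_T for a final recommendation, R_T for cumulative regret) are assumptions made here for illustration and are not taken from the paper; the paper's exact statements, including any constants or logarithmic factors, may differ.

```latex
% Cumulative regret after T rounds (assumed notation, not the paper's):
R_T = \sum_{t=1}^{T} \Bigl( \sup_{x} f(x) - f(x_t) \Bigr),
\qquad \text{square-root regret: } R_T \text{ grows on the order of } \sqrt{T}.

% Optimisation error of a final recommendation \hat{x}_T:
\sup_{x} f(x) - f(\hat{x}_T),
\qquad \text{inverse-square-root error: decays on the order of } T^{-1/2}.
```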

Abstract

We describe a novel algorithm for noisy global optimisation and continuum-armed bandits, with good convergence properties over any continuous reward function having finitely many polynomial maxima. Over such functions, our algorithm achieves square-root regret in bandits, and inverse-square-root error in optimisation, without prior information. Our algorithm works by reducing these problems to tree-armed bandits, and we also provide new results in this setting. We show it is possible to adaptively combine multiple trees so as to minimise the regret, and also give near-matching lower bounds on the regret in terms of the zooming dimension.
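
The reduction to tree-armed bandits mentioned above can be pictured with a small sketch. This is not the paper's algorithm: it only illustrates, under assumptions made here, how a continuum of arms on [0, 1] can be organised as a tree of nested cells, how pulling a cell yields a noisy reward, and how cumulative regret is tallied. The dyadic splitting rule, the uniform sampling inside a cell, the Gaussian noise, and every function name are hypothetical choices for illustration.

```python
import random


def children(cell):
    """Split an interval cell (lo, hi) into two halves (assumed dyadic rule)."""
    lo, hi = cell
    mid = (lo + hi) / 2.0
    return [(lo, mid), (mid, hi)]


def cells_at_depth(depth):
    """Return all cells of the tree at the given depth, left to right."""
    cells = [(0.0, 1.0)]  # root cell: the whole arm space [0, 1]
    for _ in range(depth):
        cells = [child for cell in cells for child in children(cell)]
    return cells


def pull(cell, reward_fn, noise_sd, rng):
    """Play a point drawn uniformly from the cell; return (point, noisy reward)."""
    lo, hi = cell
    x = rng.uniform(lo, hi)
    return x, reward_fn(x) + rng.gauss(0.0, noise_sd)


def cumulative_regret(reward_fn, best_value, points):
    """Sum of gaps between the best mean reward and the mean reward played."""
    return sum(best_value - reward_fn(x) for x in points)


if __name__ == "__main__":
    rng = random.Random(0)
    f = lambda x: 1.0 - (x - 0.5) ** 2  # toy reward with one quadratic maximum
    played = [pull(cell, f, 0.1, rng)[0] for cell in cells_at_depth(3)]
    print(cumulative_regret(f, 1.0, played))
```

A tree-armed bandit strategy would decide which cells to refine and pull based on the observed rewards; the sketch above deliberately leaves that decision rule out, since the page does not describe it.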
