Lipschitz Bandit Optimization with Improved Efficiency

Xu Zhu

arXiv:1904.11131·cs.LG·July 11, 2019·1 cites

TL;DR

This paper introduces a practical and efficient algorithm for Lipschitz bandit optimization that improves computational complexity and nearly matches the theoretical regret lower bound.

Contribution

The paper proposes Tree UCB-Hoeffding, a novel adaptive-partitioning algorithm that lowers computational cost and is simpler to implement than existing methods.

Findings

01. Total computational cost improved to O(T log T) for the first T iterations

02. Achieves the regret lower bound up to a logarithmic factor

03. Does not require any oracle settings

Abstract

We consider the Lipschitz bandit optimization problem with an emphasis on practical efficiency. Although there is a rich literature on regret analysis of this type of problem, e.g., [Kleinberg et al. 2008, Bubeck et al. 2011, Slivkins 2014], the algorithms proposed there suffer from serious practical problems, including extreme time complexity and dependence on oracle implementations. With this motivation, we propose a novel algorithm with Upper Confidence Bound (UCB) exploration, namely Tree UCB-Hoeffding, using adaptive partitions. Our partitioning scheme is easy to implement and does not require any oracle settings. With a tree-based search strategy, the total computational cost can be improved to $O(T \log T)$ for the first $T$ iterations. In addition, our algorithm achieves the regret lower bound up to a logarithmic factor.
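The abstract does not spell out the algorithm, so the following is only a rough sketch of the general idea of UCB exploration over an adaptive partition, not the paper's Tree UCB-Hoeffding. Everything in it is an assumption made for illustration: the arm space $[0, 1]$, rewards in $[0, 1]$, a known horizon, a known Lipschitz constant, the bisection rule, the split criterion, and the names `adaptive_ucb` and `noisy_peak`. Because the Hoeffding bonus here depends only on a cell's own pull count, only the pulled cell's index changes each round, so a binary heap keeps the per-round cost at $O(\log T)$ in this sketch.

```python
import heapq
import math
import random


def adaptive_ucb(reward, horizon, lipschitz=1.0, seed=0):
    """Illustrative adaptive-partition UCB on [0, 1] (not the paper's algorithm).

    reward(x, rng) is assumed to return a noisy reward in [0, 1].
    Returns the total collected reward and the final list of cells (lo, hi).
    """
    rng = random.Random(seed)
    log_t = math.log(horizon)
    # Heap entries are (-index, lo, hi, pulls, reward_sum); heapq is a
    # min-heap, so the cell with the largest UCB index is popped first.
    heap = [(-math.inf, 0.0, 1.0, 0, 0.0)]
    total = 0.0
    for _ in range(horizon):
        _, lo, hi, pulls, reward_sum = heapq.heappop(heap)
        x = rng.uniform(lo, hi)          # play a random arm inside the cell
        r = reward(x, rng)
        total += r
        pulls += 1
        reward_sum += r
        width = hi - lo
        bonus = math.sqrt(2.0 * log_t / pulls)   # Hoeffding-style bonus
        if bonus <= lipschitz * width:
            # Statistical error is now below the cell's Lipschitz variation:
            # bisect. Children start unexplored (infinite index); discarding
            # the parent's statistics is a simplification of this sketch.
            mid = 0.5 * (lo + hi)
            heapq.heappush(heap, (-math.inf, lo, mid, 0, 0.0))
            heapq.heappush(heap, (-math.inf, mid, hi, 0, 0.0))
        else:
            index = reward_sum / pulls + bonus + lipschitz * width
            heapq.heappush(heap, (-index, lo, hi, pulls, reward_sum))
    cells = sorted((lo, hi) for _, lo, hi, _, _ in heap)
    return total, cells


def noisy_peak(x, rng):
    """Hypothetical 1-Lipschitz test reward peaking at x = 0.3, clipped to [0, 1]."""
    value = 1.0 - abs(x - 0.3) + rng.gauss(0.0, 0.1)
    return min(1.0, max(0.0, value))


if __name__ == "__main__":
    total, cells = adaptive_ucb(noisy_peak, horizon=2000)
    print(f"average reward: {total / 2000:.3f}, cells: {len(cells)}")
```

The split rule refines a cell only once its confidence width has shrunk to the scale of the cell's diameter, so sampling effort concentrates where the estimated reward is high. Whether this matches the paper's partitioning scheme or its regret guarantee is not established by the text above.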

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Reinforcement Learning in Robotics