UCBoost: A Boosting Approach to Tame Complexity and Optimality for Stochastic Bandits

Fang Liu; Sinong Wang; Swapna Buccapatnam; Ness Shroff

arXiv:1804.05929·cs.LG·April 18, 2018·1 cites

TL;DR

This paper introduces UCBoost, a boosting approach for stochastic bandit algorithms that balances near-optimal regret with low computational complexity, making it practical for real-world applications.

Contribution

The paper proposes UCBoost algorithms that achieve near-optimal regret at a much lower computational cost than optimal but expensive algorithms such as kl-UCB.

Findings

01

UCBoost($D$) has $O(1)$ complexity per arm per round, with a regret guarantee that is $1/e$-close to that of kl-UCB.

02

UCBoost($\epsilon$) offers a regret guarantee $\epsilon$-close to that of kl-UCB with $O(\log(1/\epsilon))$ complexity per arm per round.

03

Numerical results show UCBoost($\epsilon$) matches kl-UCB regret with only 1% of its computational cost.

Abstract

In this work, we address the open problem of finding low-complexity near-optimal multi-armed bandit algorithms for sequential decision making problems. Existing bandit algorithms are either sub-optimal and computationally simple (e.g., UCB1) or optimal and computationally complex (e.g., kl-UCB). We propose a boosting approach to Upper Confidence Bound based algorithms for stochastic bandits, that we call UCBoost. Specifically, we propose two types of UCBoost algorithms. We show that UCBoost($D$) enjoys $O(1)$ complexity for each arm per round as well as a regret guarantee that is $1/e$-close to that of the kl-UCB algorithm. We propose an approximation-based UCBoost algorithm, UCBoost($\epsilon$), that enjoys a regret guarantee $\epsilon$-close to that of kl-UCB as well as $O(\log(1/\epsilon))$ complexity for each arm per round. Hence, our algorithms provide practitioners a practical way…
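
To make the complexity gap between UCB1 and kl-UCB concrete, here is a minimal Python sketch of the two baseline indices for Bernoulli rewards. It is not the UCBoost algorithm, whose construction is not described in this summary. The function names, the bisection solver, the tolerance `tol`, and the exploration budget `log(t) / pulls` (which drops the usual `log log t` term) are all assumptions chosen for illustration. UCB1 has a closed form, while the kl-UCB index is the root of a KL-divergence constraint and is found here by bisection, so the number of iterations grows with the logarithm of the inverse tolerance.

```python
import math


def kl_bernoulli(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clipped away from 0 and 1."""
    eps = 1e-15
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def ucb1_index(mean, pulls, t):
    """UCB1 index: closed form, constant work per arm."""
    return mean + math.sqrt(2 * math.log(t) / pulls)


def kl_ucb_index(mean, pulls, t, tol=1e-6):
    """Largest q in [mean, 1] with pulls * KL(mean, q) <= log(t), found by bisection.

    Returns (index, steps). `tol` and the budget log(t)/pulls are illustrative
    assumptions, not values taken from the paper.
    """
    budget = math.log(t) / pulls
    lo, hi = mean, 1.0
    steps = 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if kl_bernoulli(mean, mid) <= budget:
            lo = mid  # mid still satisfies the constraint, so move up
        else:
            hi = mid  # mid violates the constraint, so move down
        steps += 1
    return lo, steps


if __name__ == "__main__":
    mean, pulls, t = 0.5, 100, 1000
    print("UCB1 index:", ucb1_index(mean, pulls, t))
    for tol in (1e-2, 1e-4, 1e-6):
        index, steps = kl_ucb_index(mean, pulls, t, tol)
        print(f"kl-UCB index (tol={tol}): {index:.6f} after {steps} bisection steps")
```

In this sketch the kl-UCB index is tighter than the UCB1 index for the same statistics, but each evaluation costs a loop of KL computations instead of a single square root, which is the trade-off the abstract describes.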

Taxonomy

Topics: Advanced Bandit Algorithms Research · Auction Theory and Applications · Stochastic Gradient Optimization Techniques