UCB Exploration for Fixed-Budget Bayesian Best Arm Identification

Rong J.B. Zhu; Yanqi Qiu

arXiv:2408.04869·cs.LG·October 24, 2024

TL;DR

This paper introduces a Bayesian UCB-based algorithm for fixed-budget best-arm identification that learns prior information to improve efficiency, providing theoretical guarantees and outperforming existing methods in empirical tests.

Contribution

It proposes a novel UCB exploration algorithm that learns prior information, enhancing fixed-budget Bayesian best-arm identification with theoretical bounds and empirical superiority.

Findings

01

Achieves upper bounds on failure probability and simple regret of order $\tilde{O}\left(\frac{\sqrt{K}}{n}\right)$

02

Outperforms state-of-the-art baselines in empirical evaluations

03

Provides both theoretical analysis and practical improvements for Bayesian BAI

Abstract

We study best-arm identification (BAI) in the fixed-budget setting. Adaptive allocations based on upper confidence bounds (UCBs), such as UCBE, are known to work well in BAI. However, it is well known that the optimal regret of such allocations is theoretically dependent on instances, which we show to be an artifact in many fixed-budget BAI problems. In this paper we propose a UCB exploration algorithm that is both theoretically and empirically efficient for the fixed-budget BAI problem under a Bayesian setting. The key idea is to learn prior information, which can enhance the performance of UCB-based BAI algorithms, as it has done in the cumulative regret minimization problem. We establish bounds on the failure probability and the simple regret for the Bayesian BAI problem, providing upper bounds of order $\tilde{O}(K / n)$, up to logarithmic factors, where $n$ represents the budget and $K$ denotes…
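
To make the setting in the abstract concrete, the following is a minimal sketch of a fixed-budget BAI loop that allocates pulls with a UCB index built from a Gaussian posterior. It is not the paper's algorithm: the function name `bayes_ucb_bai`, the Gaussian reward model, the fixed (rather than learned) prior parameters `prior_mean` and `prior_var`, and the exploration width `sqrt(2 * log(budget) * posterior_variance)` are all assumptions chosen for illustration.

```python
import math
import random


def bayes_ucb_bai(true_means, budget, prior_mean=0.0, prior_var=1.0,
                  noise_var=1.0, seed=0):
    """Illustrative fixed-budget BAI loop with a Gaussian-prior UCB index.

    Assumptions (not from the paper): Gaussian rewards with known noise
    variance, a fixed Gaussian prior shared by all arms, and a heuristic
    exploration width based on log(budget).
    """
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k    # number of pulls per arm
    sums = [0.0] * k    # sum of observed rewards per arm

    def posterior(i):
        # Conjugate Gaussian update: returns (posterior mean, posterior variance).
        precision = 1.0 / prior_var + counts[i] / noise_var
        var = 1.0 / precision
        mean = var * (prior_mean / prior_var + sums[i] / noise_var)
        return mean, var

    def ucb_index(i):
        # Posterior mean plus an exploration bonus that shrinks with pulls.
        mean, var = posterior(i)
        return mean + math.sqrt(2.0 * math.log(budget) * var)

    for _ in range(budget):
        arm = max(range(k), key=ucb_index)
        reward = rng.gauss(true_means[arm], math.sqrt(noise_var))
        counts[arm] += 1
        sums[arm] += reward

    # Recommend the arm with the highest posterior mean once the budget is spent.
    best = max(range(k), key=lambda i: posterior(i)[0])
    return best, counts
```

A call such as `bayes_ucb_bai([0.1, 0.3, 3.0], budget=300)` returns the recommended arm index together with the per-arm pull counts. In the paper's setting the prior is learned rather than supplied, so the fixed `prior_mean` and `prior_var` here stand in for whatever estimate that learning step would produce.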
