Asymptotically Optimal Fixed-Budget Best Arm Identification with Variance-Dependent Bounds

Masahiro Kato; Masaaki Imaizumi; Takuya Ishihara; Toru Kitagawa

arXiv:2302.02988·cs.LG·July 13, 2023·1 cites

Open Access

TL;DR

This paper derives variance-dependent lower bounds on the worst-case expected simple regret in fixed-budget best arm identification and proposes a strategy that is asymptotically minimax optimal with respect to them.

Contribution

It introduces the TS-HIR strategy, which achieves asymptotic minimax optimality by leveraging variance-dependent lower bounds and the HIR estimator.

Findings

01

The TS-HIR strategy attains asymptotic minimax optimality.

02

Asymptotic lower bounds on the worst-case expected simple regret are derived; they are characterized by the variances of the potential outcomes.

03

Simulation experiments support the effectiveness of the strategy.

Abstract

We investigate the problem of fixed-budget best arm identification (BAI) for minimizing expected simple regret. In an adaptive experiment, a decision maker draws one of multiple treatment arms based on past observations and observes the outcome of the drawn arm. After the experiment, the decision maker recommends the treatment arm with the highest expected outcome. We evaluate the decision based on the expected simple regret, which is the difference between the expected outcomes of the best arm and the recommended arm. Due to inherent uncertainty, we evaluate the regret using the minimax criterion. First, we derive asymptotic lower bounds for the worst-case expected simple regret, which are characterized by the variances of potential outcomes (leading factor). Based on the lower bounds, we propose the Two-Stage (TS)-Hirano-Imbens-Ridder (HIR) strategy, which utilizes the HIR estimator…
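The setup described in the abstract (adaptive draws under a fixed budget, a final recommendation, and evaluation by expected simple regret) can be made concrete with a small simulation. The following is only an illustrative sketch, not the authors' TS-HIR strategy: it assumes Gaussian outcomes, uses plain sample means rather than the HIR estimator, and allocates second-stage draws in proportion to pilot-estimated standard deviations, a rule chosen here purely for illustration. The function names `simple_regret`, `run_two_stage`, and `expected_simple_regret`, and the parameter `pilot_frac`, are hypothetical.

```python
import random
import statistics


def simple_regret(means, recommended):
    """Expected outcome of the best arm minus that of the recommended arm."""
    return max(means) - means[recommended]


def run_two_stage(means, stds, budget, pilot_frac=0.2, seed=0):
    """Hypothetical two-stage experiment; returns the recommended arm's index."""
    k = len(means)
    if budget < 2 * k:
        raise ValueError("budget must allow at least two pilot draws per arm")
    rng = random.Random(seed)
    samples = [[] for _ in range(k)]

    # Stage 1: round-robin pilot draws, at least two per arm.
    pilot = min(budget, max(2 * k, int(pilot_frac * budget)))
    for t in range(pilot):
        arm = t % k
        samples[arm].append(rng.gauss(means[arm], stds[arm]))

    # Estimate each arm's standard deviation from the pilot data.
    est_stds = [statistics.stdev(s) for s in samples]
    total = sum(est_stds)
    # Assumption: allocate in proportion to estimated standard deviations;
    # fall back to uniform allocation if every estimate is zero.
    weights = [e / total for e in est_stds] if total > 0 else [1.0 / k] * k

    # Stage 2: spend the remaining budget according to the weights.
    for _ in range(budget - pilot):
        arm = rng.choices(range(k), weights=weights)[0]
        samples[arm].append(rng.gauss(means[arm], stds[arm]))

    # Recommend the arm with the highest sample mean.
    averages = [statistics.fmean(s) for s in samples]
    return max(range(k), key=lambda a: averages[a])


def expected_simple_regret(means, stds, budget, n_rep=200):
    """Monte Carlo estimate of expected simple regret over n_rep experiments."""
    regrets = [
        simple_regret(means, run_two_stage(means, stds, budget, seed=rep))
        for rep in range(n_rep)
    ]
    return statistics.fmean(regrets)
```

For example, `expected_simple_regret([0.5, 0.4, 0.3], [1.0, 2.0, 0.5], budget=500)` estimates the regret for one particular set of arm means and variances. The minimax criterion in the abstract instead concerns the worst case over such configurations, which this sketch does not compute.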

Taxonomy

Topics: Risk and Portfolio Optimization · Advanced Bandit Algorithms Research · Forecasting Techniques and Applications