Improving the Knowledge Gradient Algorithm
Yang Le, Gao Siyang, Ho Chin Pang

TL;DR
This paper introduces an improved knowledge gradient (iKG) policy for best arm identification that is asymptotically optimal and easier to extend to variants, outperforming the original KG in numerical tests.
Contribution
The paper proposes the iKG policy, enhancing KG by focusing on the probability of selecting the best arm, and proves its asymptotic optimality and extendability.
Findings
iKG is asymptotically optimal.
iKG outperforms KG in numerical experiments.
iKG is easier to extend to BAI variants.
Abstract
The knowledge gradient (KG) algorithm is a popular policy for the best arm identification (BAI) problem. It is built on the simple idea of always choosing the measurement that yields the greatest expected one-step improvement in the estimate of the best mean of the arms. In this research, we show that this policy has limitations that prevent the algorithm from being asymptotically optimal. We then provide a remedy that keeps the one-step look-ahead structure of KG but instead chooses the measurement that yields the greatest one-step improvement in the probability of selecting the best arm. The new policy is called improved knowledge gradient (iKG). iKG can be shown to be asymptotically optimal. In addition, we show that, compared to KG, iKG is easier to extend to variant problems of BAI, with ε-good arm identification and feasible arm identification as two examples. The…
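To make the one-step look-ahead idea concrete, the following is a minimal sketch of the classical KG rule under assumptions that the abstract does not state: independent normal beliefs on each arm's mean and known sampling noise variances. The function names `kg_factors` and `kg_choose` and their parameters are hypothetical, chosen for illustration. The sketch shows the original KG criterion only; it is not the iKG policy proposed in the paper, whose exact form is not given in this summary.

```python
import math


def _pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def kg_factors(mu, var, noise_var):
    """Expected one-step improvement in the best posterior mean, per arm.

    Assumes (illustrative, not from the paper summary) independent normal
    beliefs with posterior means `mu`, posterior variances `var`, and known
    sampling noise variances `noise_var`. Requires at least two arms.
    """
    k = len(mu)
    factors = []
    for i in range(k):
        best_other = max(mu[j] for j in range(k) if j != i)
        # Standard deviation of the change in arm i's posterior mean
        # caused by one more sample of arm i.
        sigma_tilde = var[i] / math.sqrt(var[i] + noise_var[i])
        if sigma_tilde == 0.0:
            factors.append(0.0)  # nothing left to learn about this arm
            continue
        z = -abs(mu[i] - best_other) / sigma_tilde
        factors.append(sigma_tilde * (z * _cdf(z) + _pdf(z)))
    return factors


def kg_choose(mu, var, noise_var):
    """Index of the arm with the largest KG factor."""
    f = kg_factors(mu, var, noise_var)
    return max(range(len(f)), key=f.__getitem__)
```

For example, with two arms that have equal posterior means but different posterior variances, `kg_choose([0.0, 0.0], [1.0, 4.0], [1.0, 1.0])` selects the more uncertain arm (index 1), because one more sample there is expected to move the estimate of the best mean the most.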
Taxonomy
Topics: Neural Networks and Applications · Control Systems and Identification
