TL;DR
This paper investigates why simple Bayesian optimization (BO) methods perform well on high-dimensional real-world tasks. It identifies the underlying challenges and proposes MSR, a simple variant of maximum likelihood estimation (MLE) of Gaussian process length scales, which achieves state-of-the-art results.
Contribution
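It shows that Gaussian process (GP) initialization schemes and local search behavior largely determine whether high-dimensional BO succeeds, and it proposes MSR, a simple variant of MLE of GP length scales, which reaches state-of-the-art performance on real-world applications.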
Findings
Vanishing gradients caused by GP initialization schemes play a major role in the failures of high-dimensional BO.
Methods promoting local search are more effective in high dimensions.
MLE of GP length scales suffices for state-of-the-art performance; the proposed MSR variant of MLE builds on this finding.
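To make the first finding more concrete, here is a minimal sketch of one common intuition for why a poorly chosen initial length scale can flatten a GP in high dimensions. It is an illustration under stated assumptions, not the paper's analysis: the RBF kernel, the uniform sampling on the unit cube, the fixed length scale of 0.5, and the sqrt(dim) rescaling are all choices made here for demonstration. With a fixed length scale, the kernel value between two random points collapses toward zero as the dimension grows, so the posterior reverts to the prior almost everywhere and the acquisition surface carries little gradient signal.

```python
import math
import random


def rbf_kernel(x, y, length_scale):
    """Isotropic RBF kernel: exp(-||x - y||^2 / (2 * length_scale^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * length_scale ** 2))


def mean_kernel_value(dim, length_scale, n_pairs=200, seed=0):
    """Average kernel value over random pairs drawn uniformly from [0, 1]^dim."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_pairs):
        x = [rng.random() for _ in range(dim)]
        y = [rng.random() for _ in range(dim)]
        total += rbf_kernel(x, y, length_scale)
    return total / n_pairs


# Hypothetical settings chosen for illustration only.
BASE_LENGTH_SCALE = 0.5
results = {}
for dim in (2, 20, 200, 1000):
    # Fixed length scale, independent of the dimension.
    fixed = mean_kernel_value(dim, BASE_LENGTH_SCALE)
    # Length scale grown with sqrt(dim), an assumed rescaling for comparison.
    scaled = mean_kernel_value(dim, BASE_LENGTH_SCALE * math.sqrt(dim))
    results[dim] = (fixed, scaled)
    print(f"dim={dim:5d}  fixed={fixed:.3e}  scaled={scaled:.3f}")
```

The expected squared distance between two uniform points grows linearly with the dimension, which is why the fixed-length-scale column shrinks exponentially while the rescaled column stays roughly constant.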
Abstract
Recent work reported that simple Bayesian optimization (BO) methods perform well for high-dimensional real-world tasks, seemingly contradicting prior work and tribal knowledge. This paper investigates why. We identify underlying challenges that arise in high-dimensional BO and explain why recent methods succeed. Our empirical analysis shows that vanishing gradients caused by Gaussian process (GP) initialization schemes play a major role in the failures of high-dimensional Bayesian optimization (HDBO) and that methods that promote local search behaviors are better suited for the task. We find that maximum likelihood estimation (MLE) of GP length scales suffices for state-of-the-art performance. Based on this, we propose a simple variant of MLE called MSR that leverages these findings to achieve state-of-the-art performance on a comprehensive set of real-world applications. We present…
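As a rough illustration of what maximum likelihood estimation of a GP length scale involves, the sketch below evaluates the negative log marginal likelihood of a zero-mean GP on a grid of candidate length scales and keeps the minimizer. Everything specific here is an assumption made for the example: the toy objective, the single isotropic length scale, the fixed noise level, and the grid search (practical libraries typically use gradient-based optimization over per-dimension length scales). It does not implement MSR, whose details are not given in this summary.

```python
import math
import random


def rbf(x, y, length_scale):
    """Isotropic RBF kernel."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * length_scale ** 2))


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def neg_log_marginal_likelihood(xs, ys, length_scale, noise=1e-2):
    """0.5 * y^T K^-1 y + 0.5 * log|K| + 0.5 * n * log(2 * pi) for a zero-mean GP."""
    n = len(xs)
    k = [[rbf(xs[i], xs[j], length_scale) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    low = cholesky(k)
    # Forward substitution: solve low @ z = ys, so that z^T z = y^T K^-1 y.
    z = []
    for i in range(n):
        z.append((ys[i] - sum(low[i][j] * z[j] for j in range(i))) / low[i][i])
    data_fit = 0.5 * sum(v * v for v in z)
    half_log_det = sum(math.log(low[i][i]) for i in range(n))
    return data_fit + half_log_det + 0.5 * n * math.log(2.0 * math.pi)


# Toy data (hypothetical): 30 points in 5 dimensions, objective depends on x[0].
rng = random.Random(0)
xs = [[rng.random() for _ in range(5)] for _ in range(30)]
raw = [math.sin(3.0 * x[0]) for x in xs]
mean_y = sum(raw) / len(raw)
ys = [v - mean_y for v in raw]  # center targets to match the zero-mean GP

# Grid search over candidate length scales; the MLE is the minimizer of the NLL.
candidates = [0.1 * 2 ** k for k in range(8)]
best_ls = min(candidates, key=lambda ls: neg_log_marginal_likelihood(xs, ys, ls))
print("selected length scale:", best_ls)
```

The grid keeps the example dependency-free and easy to inspect; swapping in a gradient-based optimizer would change how the minimizer is found but not the objective being minimized.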