Few-Shot Bayesian Optimization with Deep Kernel Surrogates
Martin Wistuba, Josif Grabocka

TL;DR
This paper introduces a few-shot Bayesian optimization approach using a deep kernel surrogate model that is meta-learned to quickly adapt to new hyperparameter tuning tasks, achieving state-of-the-art results.
Contribution
It proposes a novel deep kernel Gaussian process surrogate trained via meta-learning for rapid adaptation in hyperparameter optimization.
Findings
Achieves state-of-the-art HPO results on diverse datasets.
Demonstrates effective few-shot learning for hyperparameter tuning.
Outperforms recent transfer learning methods in HPO.
Abstract
Hyperparameter optimization (HPO) is a central pillar in the automation of machine learning solutions and is mainly performed via Bayesian optimization, where a parametric surrogate is learned to approximate the black box response function (e.g. validation error). Unfortunately, evaluating the response function is computationally intensive. As a remedy, earlier work emphasizes the need for transfer learning surrogates which learn to optimize hyperparameters for an algorithm from other tasks. In contrast to previous work, we propose to rethink HPO as a few-shot learning problem in which we train a shared deep surrogate model to quickly adapt (with few response evaluations) to the response function of a new task. We propose the use of a deep kernel network for a Gaussian process surrogate that is meta-learned in an end-to-end fashion in order to jointly approximate the response functions…
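To make the idea of a deep kernel Gaussian process surrogate concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a small two-layer feature extractor feeding an RBF kernel, a fixed noise level, and a unit prior variance. All function names, layer sizes, and defaults are hypothetical choices for illustration. The abstract does not state the training objective. The negative log marginal likelihood shown here is one common choice that could be summed over source tasks and minimized with respect to the network weights.

```python
import numpy as np

def init_mlp(rng, in_dim, hidden=16, out_dim=4):
    """Random weights for a tiny feature extractor (sizes are illustrative)."""
    return {
        "W1": rng.normal(scale=1.0 / np.sqrt(in_dim), size=(in_dim, hidden)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(scale=1.0 / np.sqrt(hidden), size=(hidden, out_dim)),
        "b2": np.zeros(out_dim),
    }

def features(params, X):
    """Map hyperparameter configurations X (n, d) to a latent space."""
    h = np.tanh(X @ params["W1"] + params["b1"])
    return h @ params["W2"] + params["b2"]

def deep_rbf_kernel(params, XA, XB, lengthscale=1.0):
    """RBF kernel evaluated on learned features instead of raw inputs."""
    A, B = features(params, XA), features(params, XB)
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def gp_posterior(params, X_obs, y_obs, X_new, noise=1e-3):
    """Posterior mean and variance of the latent function at X_new."""
    K = deep_rbf_kernel(params, X_obs, X_obs) + noise * np.eye(len(X_obs))
    K_s = deep_rbf_kernel(params, X_obs, X_new)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = 1.0 - np.sum(v**2, axis=0)  # prior variance k(x, x) = 1 for this RBF
    return mean, np.maximum(var, 0.0)

def neg_log_marginal_likelihood(params, X, y, noise=1e-3):
    """Objective one could minimize over network weights across source tasks."""
    K = deep_rbf_kernel(params, X, X) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return 0.5 * y @ alpha + np.sum(np.log(np.diag(L))) + 0.5 * len(y) * np.log(2.0 * np.pi)
```

In a few-shot setting, X_obs and y_obs would hold the handful of configurations already evaluated on the new task, and the posterior at candidate configurations X_new would feed an acquisition function. That step is omitted here.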
Taxonomy
Topics: Machine Learning and Data Classification · Gaussian Processes and Bayesian Inference · Advanced Multi-Objective Optimization Algorithms
Methods: Hyper-parameter optimization · Gaussian Process
