Zero-Shot AutoML with Pretrained Models
Ekrem Öztürk, Fabio Ferreira, Hadi S. Jomaa, Lars Schmidt-Thieme, Josif Grabocka, Frank Hutter

TL;DR
This paper introduces a zero-shot AutoML method that uses a meta-learned surrogate model to select a pre-trained model and its fine-tuning hyperparameters for a new dataset under a low compute budget, including when the dataset is small.
Contribution
It presents a domain-independent meta-learning approach that selects a deep learning pipeline for a new dataset from simple meta-features alone, without first evaluating candidate pipelines on that dataset, and it outperforms the challenge contenders on the ChaLearn AutoDL benchmark.
Findings
Outperforms all challenge contenders in the ChaLearn AutoDL benchmark
Effectively selects pre-trained models and hyperparameters with minimal compute
Demonstrates strong generalization across diverse datasets
Abstract
Given a new dataset D and a low compute budget, how should we choose a pre-trained model to fine-tune to D, and set the fine-tuning hyperparameters without risking overfitting, particularly if D is small? Here, we extend automated machine learning (AutoML) to best make these choices. Our domain-independent meta-learning approach learns a zero-shot surrogate model which, at test time, allows to select the right deep learning (DL) pipeline (including the pre-trained model and fine-tuning hyperparameters) for a new dataset D given only trivial meta-features describing D such as image resolution or the number of classes. To train this zero-shot model, we collect performance data for many DL pipelines on a large collection of datasets and meta-train on this data to minimize a pairwise ranking objective. We evaluate our approach under the strict time limit of the vision track of the ChaLearn…
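The abstract's two steps, meta-training a surrogate with a pairwise ranking objective and then picking a pipeline for a new dataset from its meta-features alone, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the paper's implementation: the surrogate is a plain linear scorer, `featurize`, `meta_train`, and `select_pipeline` are hypothetical helper names, the loss is a standard pairwise logistic loss, and the performance data is synthetic.

```python
import numpy as np

def featurize(meta, pipeline):
    """Joint features of a dataset and a pipeline: both vectors plus their pairwise products."""
    return np.concatenate([meta, pipeline, np.outer(meta, pipeline).ravel()])

def ranking_loss_and_grad(w, X, cost):
    """Pairwise logistic ranking loss and gradient for one dataset.

    X: (P, F) features of P pipelines; cost: (P,) observed cost (lower is better).
    Every pair (i, j) with cost[i] < cost[j] should end up with score[i] > score[j].
    """
    scores = X @ w
    i, j = np.nonzero(cost[:, None] < cost[None, :])
    margin = scores[i] - scores[j]
    loss = np.mean(np.logaddexp(0.0, -margin))
    # d/dmargin log(1 + exp(-margin)) = -1 / (1 + exp(margin))
    coeff = -1.0 / (1.0 + np.exp(margin))
    grad = (coeff[:, None] * (X[i] - X[j])).mean(axis=0)
    return loss, grad

def meta_train(datasets, steps=200, lr=0.5):
    """Gradient descent on the ranking loss averaged over all meta-training datasets."""
    w = np.zeros(datasets[0][0].shape[1])
    history = []
    for _ in range(steps):
        losses, grads = zip(*(ranking_loss_and_grad(w, X, c) for X, c in datasets))
        history.append(float(np.mean(losses)))
        w -= lr * np.mean(grads, axis=0)
    return w, history

def select_pipeline(w, meta, pipelines):
    """Zero-shot selection: score every candidate from meta-features only, return the top index."""
    X = np.stack([featurize(meta, p) for p in pipelines])
    return int(np.argmax(X @ w))

# Synthetic stand-in for the collected performance data (assumption, not real results).
rng = np.random.default_rng(0)
pipelines = rng.uniform(size=(6, 2))          # 6 candidate pipelines, 2 encoded settings each

def synthetic_cost(meta, p):
    return (p[0] - meta[0]) ** 2 + 0.1 * p[1]

train = []
for _ in range(20):                            # 20 meta-training datasets
    meta = rng.uniform(size=2)                 # 2 scaled meta-features per dataset
    X = np.stack([featurize(meta, p) for p in pipelines])
    cost = np.array([synthetic_cost(meta, p) for p in pipelines])
    train.append((X, cost))

w, history = meta_train(train)
new_meta = np.array([0.3, 0.7])                # meta-features of an unseen dataset
best = select_pipeline(w, new_meta, pipelines)
print(f"ranking loss {history[0]:.3f} -> {history[-1]:.3f}, selected pipeline index {best}")
```

Selection in this sketch needs only one scoring pass over the candidates, so nothing is trained or evaluated on the new dataset before a pipeline is chosen. A ranking loss is used rather than regression on raw performance because only the relative order of pipelines matters for picking the top one.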
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification · Adversarial Robustness in Machine Learning
