Cost-Efficient Online Hyperparameter Optimization
Jingkang Wang, Mengye Ren, Ilija Bogunovic, Yuwen Xiong, Raquel Urtasun

TL;DR
This paper introduces a cost-efficient online hyperparameter optimization method that adaptively queries validation data, significantly reducing training costs while achieving expert-level performance on image classification tasks.
Contribution
It proposes a novel Bayesian optimization algorithm with a costly feedback setting that improves efficiency in online hyperparameter tuning.
Findings
Achieves human expert-level hyperparameter tuning performance.
Reduces computational overhead compared to standard methods.
Effective on CIFAR-10 and ImageNet100 datasets.
Abstract
Recent work on hyperparameter optimization (HPO) has shown the possibility of training certain hyperparameters together with regular parameters. However, these online HPO algorithms still require running evaluation on a set of validation examples at each training step, steeply increasing the training cost. To decide when to query the validation loss, we model online HPO as a time-varying Bayesian optimization problem, on top of which we propose a novel "costly feedback" setting to capture the concept of the query cost. Under this setting, standard algorithms are cost-inefficient as they evaluate on the validation set at every round. In contrast, the cost-efficient GP-UCB algorithm proposed in this paper queries the unknown function only when the model is less confident about current decisions. We evaluate our proposed algorithm by tuning hyperparameters online for VGG and ResNet…
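The abstract's central idea, querying the costly function only when the model is uncertain about its current decision, can be illustrated with a small sketch. This is not the paper's algorithm: it drops the time-varying component, uses a fixed RBF kernel on a 1D grid, and gates queries with a simple posterior-standard-deviation threshold. The names `run_gated_gp_ucb`, `std_threshold`, `beta`, and the toy objective are all hypothetical choices made for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.2):
    # Squared-exponential kernel with unit prior variance (an assumed choice).
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_cand, noise=1e-3):
    # Zero-mean GP posterior mean and standard deviation on the candidate grid.
    if len(x_obs) == 0:
        return np.zeros(len(x_cand)), np.ones(len(x_cand))
    x_obs = np.asarray(x_obs, dtype=float)
    y_obs = np.asarray(y_obs, dtype=float)
    K = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    K_s = rbf_kernel(x_cand, x_obs)
    mean = K_s @ np.linalg.solve(K, y_obs)
    var = 1.0 - np.sum(K_s * np.linalg.solve(K, K_s.T).T, axis=1)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def run_gated_gp_ucb(objective, x_cand, rounds=50, beta=2.0, std_threshold=0.1):
    # Each round: pick the UCB maximizer, but pay for an evaluation of
    # `objective` only if the posterior is still uncertain at that point.
    # The threshold rule is an illustrative assumption, not the paper's rule.
    x_obs, y_obs, choices, n_queries = [], [], [], 0
    for _ in range(rounds):
        mean, std = gp_posterior(x_obs, y_obs, x_cand)
        i = int(np.argmax(mean + beta * std))
        choices.append(float(x_cand[i]))
        if std[i] > std_threshold:
            x_obs.append(x_cand[i])
            y_obs.append(objective(x_cand[i]))
            n_queries += 1
    return choices, n_queries

# Toy stand-in for a (negated) validation loss over one hyperparameter.
toy_objective = lambda x: -(x - 0.3) ** 2
grid = np.linspace(0.0, 1.0, 21)
choices, n_queries = run_gated_gp_ucb(toy_objective, grid)
print(f"decisions made: {len(choices)}, costly queries used: {n_queries}")
```

In this sketch a decision is made every round, but the number of costly evaluations is bounded by the number of grid points, because a point that has already been observed has a posterior standard deviation below the threshold. A standard GP-UCB loop would instead evaluate the objective on all 50 rounds.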
Taxonomy
Topics: Machine Learning and Data Classification · Advanced Bandit Algorithms Research · Advanced Neural Network Applications
Methods: Hyper-parameter optimization · Average Pooling · Softmax · Dropout · Batch Normalization · Dense Connections · 1x1 Convolution · Max Pooling · Residual Connection
