Reducing Hyperparameter Tuning Costs in ML, Vision and Language Model Training Pipelines via Memoization-Awareness
Abdelmajid Essofi, Ridwan Salahuddeen, Munachiso Nwadike, Elnura Zhalieva, Kun Zhang, Eric Xing, Willie Neiswanger, Qirong Ho

TL;DR
This paper introduces EEIPU, a memoization-aware Bayesian Optimization method that leverages pipeline caching to significantly reduce hyperparameter tuning costs and improve quality across ML, vision, and language model training.
Contribution
The paper presents a novel memoization-aware BO algorithm, EEIPU, which efficiently utilizes pipeline caching to enhance hyperparameter search in costly model training pipelines.
Findings
EEIPU evaluates 103% more hyperparameters within the same budget.
EEIPU achieves 108% higher validation metrics on average.
EEIPU outperforms recent BO algorithms in diverse pipelines.
Abstract
The training or fine-tuning of machine learning, vision, and language models is often implemented as a pipeline: a sequence of stages encompassing data preparation, model training and evaluation. In this paper, we exploit pipeline structures to reduce the cost of hyperparameter tuning for model training/fine-tuning, which is particularly valuable for language models given their high costs in GPU-days. We propose a "memoization-aware" Bayesian Optimization (BO) algorithm, EEIPU, that works in tandem with a pipeline caching system, allowing it to evaluate significantly more hyperparameter candidates per GPU-day than other tuning algorithms. The result is better-quality hyperparameters in the same amount of search time, or equivalently, reduced search time to reach the same hyperparameter quality. In our benchmarks on machine learning (model ensembles), vision (convolutional architecture)…
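To make the caching idea in the abstract concrete, the following is a minimal sketch of prefix memoization in a staged pipeline. It is not the authors' implementation and does not include the EEIPU acquisition function. The class `CachedPipeline`, the toy stages `prep`, `train`, and `evaluate`, and the per-stage costs are all hypothetical, chosen only to show why a candidate that shares its early-stage hyperparameters with a previous run is cheaper to evaluate.

```python
from typing import Any, Callable, Dict, List, Sequence, Tuple

# A stage takes the previous stage's output and one hyperparameter.
Stage = Callable[[Any, float], Any]


class CachedPipeline:
    """Toy pipeline that memoizes the output of every hyperparameter prefix."""

    def __init__(self, stages: List[Tuple[Stage, float]]):
        # Each entry is (stage_function, assumed cost of running that stage).
        self.stages = stages
        self.cache: Dict[Tuple[float, ...], Any] = {}

    def _longest_cached_prefix(self, params: Sequence[float]) -> int:
        # Number of leading stages whose output is already stored.
        for k in range(len(params), 0, -1):
            if tuple(params[:k]) in self.cache:
                return k
        return 0

    def cost_to_run(self, params: Sequence[float]) -> float:
        # Cost a cache-aware tuner would expect to pay for this candidate.
        k = self._longest_cached_prefix(params)
        return sum(cost for _, cost in self.stages[k:])

    def run(self, params: Sequence[float]) -> Tuple[Any, float]:
        # Resume from the deepest cached stage; run and cache the rest.
        k = self._longest_cached_prefix(params)
        out = self.cache[tuple(params[:k])] if k else None
        spent = 0.0
        for i in range(k, len(self.stages)):
            fn, cost = self.stages[i]
            out = fn(out, params[i])
            spent += cost
            self.cache[tuple(params[: i + 1])] = out
        return out, spent


# Hypothetical stand-ins for data preparation, training, and evaluation.
def prep(_: Any, p: float) -> float:
    return p * 2


def train(x: float, p: float) -> float:
    return x + p


def evaluate(x: float, p: float) -> float:
    return x * p


pipeline = CachedPipeline([(prep, 1.0), (train, 10.0), (evaluate, 2.0)])
_, first_cost = pipeline.run((0.5, 0.1, 2.0))   # cold cache: pays 1 + 10 + 2
_, second_cost = pipeline.run((0.5, 0.1, 3.0))  # reuses prep and train: pays 2
```

In this sketch, the second candidate changes only the last-stage hyperparameter, so it pays for one stage instead of three. A memoization-aware tuner could consult something like `cost_to_run` when ranking candidates, which is one way to read the abstract's claim of evaluating more candidates per GPU-day.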
Taxonomy
Topics: Natural Language Processing Techniques
