A System for Massively Parallel Hyperparameter Tuning
Liam Li, Kevin Jamieson, Afshin Rostamizadeh, Ekaterina Gonina, Moritz Hardt, Benjamin Recht, Ameet Talwalkar

TL;DR
This paper introduces ASHA, a scalable and robust hyperparameter optimization algorithm that leverages parallelism and early-stopping, significantly improving efficiency in distributed machine learning workloads.
Contribution
The paper presents ASHA, a novel hyperparameter tuning algorithm that outperforms existing methods and scales linearly with the number of workers in distributed environments.
Findings
ASHA outperforms state-of-the-art methods in hyperparameter tuning.
ASHA scales linearly with the number of workers.
ASHA integrates effectively into production systems for large-scale hyperparameter optimization.
Abstract
Modern learning models are characterized by large hyperparameter spaces and long training times. These properties, coupled with the rise of parallel computing and the growing demand to productionize machine learning workloads, motivate the need to develop mature hyperparameter optimization functionality in distributed computing settings. We address this challenge by first introducing a simple and robust hyperparameter optimization algorithm called ASHA, which exploits parallelism and aggressive early-stopping to tackle large-scale hyperparameter optimization problems. Our extensive empirical results show that ASHA outperforms existing state-of-the-art hyperparameter optimization methods; scales linearly with the number of workers in distributed settings; and is suitable for massive parallelism, as demonstrated on a task with 500 workers. We then describe several design decisions we…
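The abstract describes ASHA as combining parallelism with aggressive early-stopping. A minimal sketch of how such a scheduler might hand out work is given below, assuming the usual asynchronous successive halving rule: a free worker promotes any not-yet-promoted configuration that sits in the top 1/eta of its rung, and otherwise starts a fresh configuration at the bottom rung. The names `asha_next_job` and `run_asha`, the toy objective, and the default values are hypothetical illustrations, not the authors' implementation, and jobs are simulated one at a time rather than on real workers.

```python
import random


def asha_next_job(rungs, promoted, eta, new_config):
    """Pick the next (config_id, rung) for a free worker.

    rungs[k]    : dict mapping config_id -> loss observed at rung k
    promoted[k] : set of config_ids already promoted out of rung k
    """
    # Check the highest rungs first so promising configs advance quickly.
    for k in range(len(rungs) - 2, -1, -1):
        ranked = sorted(rungs[k], key=rungs[k].get)
        top = ranked[: len(ranked) // eta]  # best 1/eta fraction seen so far
        for cfg in top:
            if cfg not in promoted[k]:
                promoted[k].add(cfg)
                return cfg, k + 1
    # Nothing is promotable: grow the bottom rung with a fresh config.
    return new_config(), 0


def run_asha(num_jobs, eta=3, num_rungs=3, min_resource=1, seed=0):
    """Simulate num_jobs sequential jobs and return the per-rung results."""
    rng = random.Random(seed)
    rungs = [dict() for _ in range(num_rungs)]
    promoted = [set() for _ in range(num_rungs)]
    quality = {}  # hypothetical hidden "true" loss of each config

    def new_config():
        cfg = len(quality)
        quality[cfg] = rng.random()
        return cfg

    for _ in range(num_jobs):
        cfg, k = asha_next_job(rungs, promoted, eta, new_config)
        resource = min_resource * eta ** k  # budget grows by eta per rung
        # Toy objective (assumption): noise shrinks as resource increases.
        rungs[k][cfg] = quality[cfg] + rng.gauss(0, 0.1) / resource
    return rungs


if __name__ == "__main__":
    for k, results in enumerate(run_asha(num_jobs=40)):
        print(f"rung {k}: {len(results)} configs evaluated")
```

Because the decision is made per free worker and never waits for a rung to fill, no worker sits idle. The trade-off in this sketch is that a configuration can be promoted on the basis of an early ranking that later results would have overturned.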
Taxonomy
Topics: Machine Learning and Data Classification · Advanced Bandit Algorithms Research · Advanced Neural Network Applications
