SHADHO: Massively Scalable Hardware-Aware Distributed Hyperparameter Optimization
Jeff Kinnison, Nathaniel Kremer-Herman, Douglas Thain, Walter Scheirer

TL;DR
SHADHO is a scalable, hardware-aware framework for distributed hyperparameter optimization that improves efficiency by considering hardware heterogeneity and search space complexity, leading to better model performance.
Contribution
Introduces SHADHO, a novel framework that dynamically assigns hyperparameter search tasks to heterogeneous hardware based on complexity and performance metrics.
Findings
Achieves double the throughput of a standard distributed hyperparameter optimization framework when optimizing an SVM on MNIST.
Discovers 515 U-Net models that outperform the standard U-Net in a one-week search using 74 GPUs.
Effectively balances search across heterogeneous hardware environments.
Abstract
Computer vision is experiencing an AI renaissance, in which machine learning models are expediting important breakthroughs in academic research and commercial applications. Effectively training these models, however, is not trivial due in part to hyperparameters: user-configured values that control a model's ability to learn from data. Existing hyperparameter optimization methods are highly parallel but make no effort to balance the search across heterogeneous hardware or to prioritize searching high-impact spaces. In this paper, we introduce a framework for massively Scalable Hardware-Aware Distributed Hyperparameter Optimization (SHADHO). Our framework calculates the relative complexity of each search space and monitors performance on the learning task over all trials. These metrics are then used as heuristics to assign hyperparameters to distributed workers based on their hardware.…
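The assignment idea described in the abstract can be illustrated with a minimal sketch. This is not SHADHO's API or its actual heuristics: the class names, the `speed` field, the use of loss spread as a priority score, and the additive ranking are all assumptions made for illustration. It only shows the general pattern of scoring each search space by complexity and observed performance, then pairing higher-scoring spaces with more capable hardware.

```python
from dataclasses import dataclass, field
from statistics import pstdev


@dataclass
class SearchSpace:
    name: str
    complexity: float  # hypothetical precomputed complexity score
    losses: list = field(default_factory=list)  # losses observed so far

    def priority(self) -> float:
        # Hypothetical heuristic: spread of the observed losses.
        # Spaces with fewer than two trials get a priority of 0.0.
        return pstdev(self.losses) if len(self.losses) > 1 else 0.0


@dataclass
class Worker:
    name: str
    speed: float  # hypothetical relative hardware throughput


def assign(spaces, workers):
    """Pair the highest-scoring search spaces with the fastest workers."""
    ranked_spaces = sorted(
        spaces, key=lambda s: s.complexity + s.priority(), reverse=True
    )
    ranked_workers = sorted(workers, key=lambda w: w.speed, reverse=True)
    # zip stops at the shorter list, so surplus spaces or workers are left out.
    return {w.name: s.name for w, s in zip(ranked_workers, ranked_spaces)}
```

With two spaces (say, a small linear-kernel SVM space and a larger RBF-kernel one) and two workers (a CPU node and a GPU node), `assign` would route the higher-scoring space to the faster worker. A real scheduler would also need to re-rank as new results arrive and handle workers joining or leaving.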
Taxonomy
Methods: Concatenated Skip Connection · Max Pooling · Convolution · U-Net · Support Vector Machine
