Resource-Adaptive Successive Doubling for Hyperparameter Optimization with Large Datasets on High-Performance Computing Systems
Marcel Aach, Rakesh Sarma, Helmut Neukirchen, Morris Riedel, Andreas Lintermann

TL;DR
This paper introduces RASDA, a resource-adaptive hyperparameter optimization algorithm that efficiently scales on large HPC systems, outperforming existing methods in runtime while maintaining or improving solution quality on large datasets.
Contribution
The paper proposes RASDA, a novel resource-adaptive successive doubling algorithm that enhances hyperparameter optimization scalability and efficiency on large-scale HPC systems.
Findings
RASDA outperforms ASHA by up to 1.9x in runtime.
RASDA maintains or improves the quality of final models.
First application of systematic HPO on terabyte-scale scientific data.
Abstract
On High-Performance Computing (HPC) systems, several hyperparameter configurations can be evaluated in parallel to speed up the Hyperparameter Optimization (HPO) process. State-of-the-art HPO methods follow a bandit-based approach and build on top of successive halving, where the final performance of a combination is estimated based on a lower than fully trained fidelity performance metric and more promising combinations are assigned more resources over time. Frequently, the number of epochs is treated as a resource, letting more promising combinations train longer. Another option is to use the number of workers as a resource and directly allocate more workers to more promising configurations via data-parallel training. This article proposes a novel Resource-Adaptive Successive Doubling Algorithm (RASDA), which combines a resource-adaptive successive doubling scheme with the plain…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsParallel Computing and Optimization Techniques · Machine Learning and Data Classification
MethodsSPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Hyper-parameter optimization
