HPC Storage Service Autotuning Using Variational-Autoencoder-Guided Asynchronous Bayesian Optimization
Matthieu Dorier, Romain Egele, Prasanna Balaprakash, Jaehoon Koo, Sandeep Madireddy, Srinivasan Ramesh, Allen D. Malony, Rob Ross

TL;DR
This paper introduces a novel variational-autoencoder-guided asynchronous Bayesian optimization method for autotuning HPC storage services, significantly improving search efficiency and resource utilization.
Contribution
It presents a transfer learning-based approach that enhances autotuning efficiency for HPC storage services using a variational autoencoder and Bayesian optimization.
Findings
With transfer learning, achieves a search more than 40x faster than random search.
Outperforms state-of-the-art autotuning frameworks in resource utilization.
Demonstrates effectiveness on high-energy physics workflows on Argonne's Theta supercomputer.
Abstract
Distributed data storage services tailored to specific applications have grown popular in the high-performance computing (HPC) community as a way to address I/O and storage challenges. These services offer a variety of specific interfaces, semantics, and data representations. They also expose many tuning parameters, making it difficult for their users to find the best configuration for a given workload and platform. To address this issue, we develop a novel variational-autoencoder-guided asynchronous Bayesian optimization method to tune HPC storage service parameters. Our approach uses transfer learning to leverage prior tuning results and uses a dynamically updated surrogate model to explore the large parameter search space in a systematic way. We implement our approach within the DeepHyper open-source framework, and apply it to the autotuning of a high-energy physics workflow on…
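The search loop described in the abstract can be illustrated with a toy sketch. This is not the paper's method and does not use the DeepHyper API. It makes several loud assumptions: the variational autoencoder is replaced by Gaussian sampling around good configurations from a prior run, the surrogate is a simple k-nearest-neighbor average rather than a Bayesian model, and `toy_runtime`, `sample_candidate`, `predict`, and `async_search` are hypothetical names invented for illustration. What it does show is the asynchronous pattern: as soon as any worker finishes, the surrogate's history is updated and a new configuration is submitted.

```python
import random
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED

DIM = 2  # hypothetical search space: two parameters, each scaled to [0, 1]


def toy_runtime(config):
    """Stand-in objective (lower is better); NOT a real storage benchmark."""
    return sum((x - 0.7) ** 2 for x in config)


def sample_candidate(rng, prior_best, spread=0.1):
    """Sample near a good prior configuration (transfer), else uniformly."""
    if prior_best and rng.random() < 0.8:
        center = rng.choice(prior_best)
        return tuple(min(1.0, max(0.0, c + rng.gauss(0, spread))) for c in center)
    return tuple(rng.random() for _ in range(DIM))


def predict(config, history, k=3):
    """Toy surrogate: mean runtime of the k nearest evaluated configurations."""
    if not history:
        return 0.0

    def sq_dist(entry):
        return sum((a - b) ** 2 for a, b in zip(config, entry[0]))

    nearest = sorted(history, key=sq_dist)[:k]
    return sum(y for _, y in nearest) / len(nearest)


def async_search(prior_best, budget=40, workers=4, seed=0):
    """Return the best (config, runtime) pair found within the budget."""
    rng = random.Random(seed)
    history = []  # list of (config, runtime) pairs seen so far

    def propose():
        # Greedy choice: candidate with the lowest predicted runtime.
        cands = [sample_candidate(rng, prior_best) for _ in range(20)]
        return min(cands, key=lambda c: predict(c, history))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        pending = {}  # future -> config being evaluated
        submitted = 0
        while submitted < min(workers, budget):
            cfg = propose()
            pending[pool.submit(toy_runtime, cfg)] = cfg
            submitted += 1
        while pending:
            # Do not wait for the whole batch: react to the first result.
            done, _ = wait(pending, return_when=FIRST_COMPLETED)
            for fut in done:
                history.append((pending.pop(fut), fut.result()))
                if submitted < budget:
                    cfg = propose()
                    pending[pool.submit(toy_runtime, cfg)] = cfg
                    submitted += 1
    return min(history, key=lambda h: h[1])
```

A call such as `best_config, best_runtime = async_search(prior_best=[(0.65, 0.75)])` warm-starts the search from one assumed prior result; passing an empty list falls back to uniform sampling, which is the no-transfer baseline in this toy setting.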
