HPC Storage Service Autotuning Using Variational-Autoencoder-Guided Asynchronous Bayesian Optimization

Matthieu Dorier; Romain Egele; Prasanna Balaprakash; Jaehoon Koo; Sandeep Madireddy; Srinivasan Ramesh; Allen D. Malony; Rob Ross

arXiv:2210.00798·cs.DC·October 4, 2022

TL;DR

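This paper introduces a variational-autoencoder-guided asynchronous Bayesian optimization method for autotuning HPC storage services. With transfer learning from prior tuning results, the search is reported to be over 40x faster than random search, and the method makes better use of compute resources than existing autotuning frameworks.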

Contribution

The paper presents a transfer-learning-based approach that combines a variational autoencoder with Bayesian optimization to make autotuning of HPC storage services more efficient.

Findings

01

With transfer learning, achieves a search more than 40x faster than random search.

02

Outperforms state-of-the-art autotuning frameworks in resource utilization.

03

Demonstrates effectiveness on high-energy physics workflows on Argonne's Theta supercomputer.

Abstract

Distributed data storage services tailored to specific applications have grown popular in the high-performance computing (HPC) community as a way to address I/O and storage challenges. These services offer a variety of specific interfaces, semantics, and data representations. They also expose many tuning parameters, making it difficult for their users to find the best configuration for a given workload and platform. To address this issue, we develop a novel variational-autoencoder-guided asynchronous Bayesian optimization method to tune HPC storage service parameters. Our approach uses transfer learning to leverage prior tuning results and uses a dynamically updated surrogate model to explore the large parameter search space in a systematic way. We implement our approach within the DeepHyper open-source framework, and apply it to the autotuning of a high-energy physics workflow on…
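To make two of the ideas in the abstract concrete, here is a minimal Python sketch of (1) asynchronous evaluation, where a new configuration is submitted as soon as any worker finishes, and (2) reusing prior tuning results to bias where new configurations are sampled. It is not the paper's method and does not use the DeepHyper API: a per-parameter Gaussian fitted to the best prior configurations stands in for the variational autoencoder, the surrogate model is omitted entirely, and the parameter names (`num_threads`, `batch_size`) and the toy `objective` are hypothetical.

```python
import random
import statistics
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

# Hypothetical search space: parameter names and ranges are illustrative only.
BOUNDS = {"num_threads": (1.0, 32.0), "batch_size": (1.0, 1024.0)}


def objective(cfg):
    """Toy stand-in for a measured workflow run time (lower is better)."""
    return (cfg["num_threads"] - 16.0) ** 2 + ((cfg["batch_size"] - 256.0) / 32.0) ** 2


def fit_prior(prior_results, top_fraction=0.3):
    """Fit an independent Gaussian per parameter to the best prior configurations.

    This simple Gaussian is an assumption standing in for the paper's VAE.
    Requires at least two prior results.
    """
    ranked = sorted(prior_results, key=lambda item: item[1])
    top = [cfg for cfg, _ in ranked[: max(2, int(len(ranked) * top_fraction))]]
    prior = {}
    for name in BOUNDS:
        values = [cfg[name] for cfg in top]
        prior[name] = (statistics.mean(values), statistics.stdev(values) + 1e-6)
    return prior


def sample(prior, rng):
    """Draw one configuration: uniform without a prior, Gaussian with one."""
    cfg = {}
    for name, (low, high) in BOUNDS.items():
        if prior is None:
            value = rng.uniform(low, high)
        else:
            mean, std = prior[name]
            value = rng.gauss(mean, std)
        cfg[name] = min(high, max(low, value))  # clip to the allowed range
    return cfg


def async_search(prior=None, budget=40, workers=4, seed=0):
    """Evaluate `budget` configurations, refilling a worker as soon as it is free."""
    rng = random.Random(seed)
    history = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pending = {}
        for _ in range(min(workers, budget)):
            cfg = sample(prior, rng)
            pending[pool.submit(objective, cfg)] = cfg
        submitted = len(pending)
        while pending:
            done, _ = wait(list(pending), return_when=FIRST_COMPLETED)
            for future in done:
                history.append((pending.pop(future), future.result()))
                if submitted < budget:
                    cfg = sample(prior, rng)
                    pending[pool.submit(objective, cfg)] = cfg
                    submitted += 1
    return history


if __name__ == "__main__":
    previous = async_search(prior=None, budget=40, seed=1)  # earlier tuning run
    warm = async_search(prior=fit_prior(previous), budget=20, seed=2)
    print("best value with prior:", min(value for _, value in warm))
```

In this sketch the sampling distribution is fixed once the prior is fitted. A Bayesian optimization loop would additionally refit a surrogate model on `history` after each completed evaluation and use it to choose the next configuration.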
