SiL: An Approach for Adjusting Applications to Heterogeneous Systems Under Perturbations
Ali Mohammed, Florina M. Ciorba

TL;DR
This paper introduces SiL, a control-theoretic approach for dynamically selecting loop scheduling techniques to optimize scientific application performance on heterogeneous HPC systems under various perturbations.
Contribution
The paper proposes SiL, a method that dynamically selects the DLS technique while accounting for several kinds of perturbation (computational speed, network bandwidth, and network latency), improving performance over a statically chosen technique.
Findings
SiL outperforms a statically selected DLS technique in most scenarios.
Perturbations in network bandwidth and latency significantly impact application performance.
No single DLS technique is optimal for all perturbation conditions.
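The last two findings can be illustrated with a toy model. The sketch below is not the paper's simulator or controller, which this summary does not describe. It is a minimal Python illustration under stated assumptions: the function names (`makespan`, `candidates`, `select`), the worker speeds, and the per-chunk `overhead` (a stand-in for network latency) are all hypothetical. It predicts the finish time of two simple techniques under the current conditions and picks the faster one.

```python
import heapq
import math

def makespan(chunks, speeds, overhead):
    """Predict the finish time when each free worker grabs the next chunk.

    chunks   : chunk sizes in the order they are handed out
    speeds   : iterations per second for each worker (hypothetical values)
    overhead : seconds paid per chunk request (stand-in for network latency)
    """
    # Min-heap of (time at which the worker becomes free, worker index).
    free_at = [(0.0, w) for w in range(len(speeds))]
    heapq.heapify(free_at)
    finish = 0.0
    for size in chunks:
        t, w = heapq.heappop(free_at)
        t += overhead + size / speeds[w]
        finish = max(finish, t)
        heapq.heappush(free_at, (t, w))
    return finish

def candidates(n_iters, n_workers):
    """Two simple techniques: one big chunk per worker, or one iteration per request."""
    size = math.ceil(n_iters / n_workers)
    static = [min(size, n_iters - i) for i in range(0, n_iters, size)]
    return {"STATIC": static, "SS": [1] * n_iters}

def select(n_iters, speeds, overhead):
    """Simulate every candidate under the given conditions and return the best."""
    predicted = {
        name: makespan(chunks, speeds, overhead)
        for name, chunks in candidates(n_iters, len(speeds)).items()
    }
    return min(predicted, key=predicted.get), predicted

# One slow worker, free chunk requests: fine-grained self-scheduling wins.
best_slow_worker, _ = select(1000, [100, 100, 100, 25], overhead=0.0)
# Equal workers, costly chunk requests: the single large chunk per worker wins.
best_high_latency, _ = select(1000, [100, 100, 100, 100], overhead=0.05)
print(best_slow_worker, best_high_latency)
```

In this toy model the preferred technique flips when the perturbation moves from computational speed to request latency, which is the situation a selection mechanism that re-evaluates its choice at run time is meant to handle.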
Abstract
Scientific applications consist of large and computationally intensive loops. Dynamic loop scheduling (DLS) techniques are used to load balance the execution of such applications. Load imbalance can be caused by variations in loop iteration execution times due to problem, algorithmic, or systemic characteristics (also referred to as perturbations). The following question motivates this work: "Given an application, a high-performance computing (HPC) system, and both their characteristics and interplay, which DLS technique will achieve improved performance under unpredictable perturbations?" Existing work only considers perturbations caused by variations in the computational speeds delivered by the HPC system. However, perturbations in available network bandwidth or latency are inevitable on production HPC systems. Simulator in the loop (SiL) is introduced herein as a new control-theoretic-inspired approach…
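To make the notion of a DLS technique concrete, the following minimal Python sketch shows how four textbook techniques size the chunks of loop iterations they hand out: static chunking (STATIC), self-scheduling (SS), guided self-scheduling (GSS), and the practical factoring variant (FAC2). The abstract does not say which techniques the paper evaluates, so this selection is an assumption, and the function name `chunk_sizes` is hypothetical.

```python
import math

def chunk_sizes(technique, n_iters, n_workers):
    """Return the chunk sizes handed out, in order, for a loop of n_iters iterations."""
    remaining = n_iters
    chunks = []
    if technique == "STATIC":
        # One equal share per worker, fixed before execution starts.
        size = math.ceil(n_iters / n_workers)
        while remaining > 0:
            c = min(size, remaining)
            chunks.append(c)
            remaining -= c
    elif technique == "SS":
        # One iteration per request: best balance, most scheduling requests.
        chunks = [1] * n_iters
    elif technique == "GSS":
        # Each request receives ceil(remaining / workers), so chunks shrink over time.
        while remaining > 0:
            c = math.ceil(remaining / n_workers)
            chunks.append(c)
            remaining -= c
    elif technique == "FAC2":
        # Batches of n_workers equal chunks; each batch covers about half of what is left.
        while remaining > 0:
            size = math.ceil(remaining / (2 * n_workers))
            for _ in range(n_workers):
                if remaining == 0:
                    break
                c = min(size, remaining)
                chunks.append(c)
                remaining -= c
    else:
        raise ValueError(f"unknown technique: {technique}")
    return chunks

for name in ("STATIC", "GSS", "FAC2"):
    print(name, chunk_sizes(name, 100, 4))
```

For 100 iterations on 4 workers, STATIC yields four chunks of 25, GSS starts at 25 and decreases to single iterations, and FAC2 yields batches of 13, 6, 3, 2, and 1. Larger chunks mean fewer scheduling requests but less room to rebalance when a worker or the network slows down, which is why the best choice depends on the perturbations present.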
Taxonomy
Topics: Distributed and Parallel Computing Systems · Advanced Data Storage Technologies · Parallel Computing and Optimization Techniques
