Towards Experiment Execution in Support of Community Benchmark Workflows for HPC

Gregor von Laszewski; Wesley Brewer; Sean R. Wilkinson; Andrew Shao; J.P. Fleischer; Harshad Pitkar; Christine R. Kirkpatrick; Geoffrey C. Fox

arXiv:2507.22294·cs.DC·July 31, 2025

TL;DR

This paper proposes workflow templates for HPC benchmarks: adaptable designs that can be tailored to specific scientific applications. The approach is validated with two independent tools, Cloudmesh's Experiment Executor and Hewlett Packard Enterprise's SmartSim, across several scientific use cases.

Contribution

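The paper identifies common usage patterns for benchmark workflow templates, drawn from the authors' HPC experience and recent work with the MLCommons Science working group. It argues that simple experiment management tools make the broader computational workflow easier to adapt, a concept the authors term benchmark carpentry.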

Findings

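01 Workflow templates make HPC benchmark experiments easier to adapt to specific scientific applications.

02 Two independent tools, Cloudmesh's Experiment Executor and SmartSim, have been tested across scientific applications such as earthquake prediction and simulation-AI/ML interactions.

03 Focusing on simple experiment management tools improves adaptability, especially in education.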

Abstract

A key hurdle is demonstrating compute resource capability with limited benchmarks. We propose workflow templates as a solution, offering adaptable designs for specific scientific applications. Our paper identifies common usage patterns for these templates, drawn from decades of HPC experience, including recent work with the MLCommons Science working group. We found that focusing on simple experiment management tools within the broader computational workflow improves adaptability, especially in education. This concept, which we term benchmark carpentry, is validated by two independent tools: Cloudmesh's Experiment Executor and Hewlett Packard Enterprise's SmartSim. Both frameworks, with significant functional overlap, have been tested across various scientific applications, including conduction cloudmask, earthquake prediction, simulation-AI/ML interactions, and the development of…
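To make the idea of a simple experiment management tool concrete, the following is a minimal sketch of one common pattern: expanding a command template over a grid of parameters so that each combination becomes one experiment. It is an illustration only and does not use or reproduce the API of Cloudmesh's Experiment Executor or SmartSim. The names `COMMAND_TEMPLATE`, `expand_experiments`, the `train.py` script, and the parameters `epochs`, `gpus`, and `seed` are all hypothetical.

```python
from itertools import product
from string import Template

# Hypothetical command template; train.py and its flags are placeholders,
# not part of any tool named in the paper.
COMMAND_TEMPLATE = Template(
    "python train.py --epochs $epochs --gpus $gpus --seed $seed"
)


def expand_experiments(template, parameters):
    """Return one (config, command) pair per point in the parameter grid."""
    names = sorted(parameters)  # fixed order keeps the expansion reproducible
    experiments = []
    for values in product(*(parameters[name] for name in names)):
        config = dict(zip(names, values))
        experiments.append((config, template.substitute(config)))
    return experiments


# Assumed example grid: 2 epoch settings x 2 GPU counts x 1 seed.
grid = {"epochs": [1, 10], "gpus": [1, 2], "seed": [0]}
experiments = expand_experiments(COMMAND_TEMPLATE, grid)

for config, command in experiments:
    print(config, "->", command)
```

In this sketch the commands are only generated, not executed. A real tool would typically also submit each command to a scheduler and record its outcome, which is outside what this example shows.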
