SIM-SITU: A Framework for the Faithful Simulation of in-situ Workflows
Valentin Honoré (CC-IN2P3), Tu Mai Anh Do, Loïc Pottier, Rafael Ferreira da Silva (ORNL), Ewa Deelman, Frédéric Suter (CC-IN2P3)

TL;DR
SIM-SITU is a flexible simulation framework built on SimGrid that accurately models in-situ workflows, enabling performance evaluation of resource allocation and mapping strategies without costly real-world testing.
Contribution
The paper introduces SIM-SITU, a novel, faithful simulation framework for in-situ workflows that improves evaluation accuracy over simplified models and reduces resource consumption.
Findings
Effective simulation of in-situ workflows demonstrated
Impact of allocation and mapping strategies analyzed
Framework validated with molecular dynamics case study
Abstract
The amount of data generated by numerical simulations in various scientific domains such as molecular dynamics, climate modeling, biology, or astrophysics has led to a fundamental redesign of application workflows. The throughput and the capacity of storage subsystems have not evolved as fast as the computing power in extreme-scale supercomputers. As a result, the classical post-hoc analysis of simulation outputs became highly inefficient. In-situ workflows have then emerged as a solution in which simulation and data analytics are intertwined through shared computing resources, thus lowering latencies. Determining the best allocation, i.e., how many resources to allocate to each component of an in-situ workflow, and mapping, i.e., where and at which frequency to run the data analytics component, is a complex task whose performance assessment is crucial to the efficient execution of in-situ…
Taxonomy
Topics: Scientific Computing and Data Management · Advanced Data Storage Technologies · Distributed and Parallel Computing Systems
