Reproducible Cross-border High Performance Computing for Scientific Portals
Kessy Abarenkov, Anne Fouilloux, Helmut Neukirchen, Abdulrahman Azab

TL;DR
This paper presents a method for enabling reproducible, cross-border access to high-performance computing resources through scientific portals, using containerization and automation to ensure consistency and ease of use.
Contribution
It introduces a framework that allows scientific portals to access remote HPC and cloud resources with container-based reproducibility and automation, facilitating cross-border scientific collaboration.
Findings
Successful integration of remote HPC resources into scientific portals.
Enhanced reproducibility through containerized software environments.
Improved user-friendliness for scientists accessing distributed HPC resources.
Abstract
To reproduce eScience, several challenges need to be solved: scientific workflows need to be automated; the software versions involved need to be provided unambiguously; input data needs to be easily accessible; and High-Performance Computing (HPC) clusters are often involved. To achieve bit-to-bit reproducibility, it might even be necessary to execute the code on a particular cluster to avoid differences caused by different HPC platforms; unless this is the scientist's local cluster, it needs to be accessed across (administrative) borders. Preferably, to allow even inexperienced users to (re-)produce results, all of this should be user-friendly. While some easy-to-use web-based scientific portals already support access to HPC resources, this typically covers only local computing and data resources. By the example of two community-specific portals in the fields of…
Taxonomy
Topics: Scientific Computing and Data Management · Research Data Management Practices · Distributed and Parallel Computing Systems
