# Data management for distributed computational workflows: An iRODS-based setup and its performance

**Authors:** Mohamad Hayek, Martin Golasowski, Stephan Hachinger, Rubén J. García-Hernández, Johannes Munke, Gabriel Lindner, Kateřina Slaninová, Philipp Tunka, Vít Vondrák, Dieter Kranzlmüller, Jan Martinovič

PMC · DOI: 10.1371/journal.pone.0340757 · PLOS One · 2026-01-12

## TL;DR

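This paper evaluates iRODS with EUDAT's B2SAFE module as the data backend of the LEXIS Platform's Distributed Data Infrastructure. It finds that wide-area transfers between two supercomputing sites can use the available network bandwidth efficiently, given suitable client configuration and file size.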

## Contribution

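The paper demonstrates that iRODS is suitable as a data backend for federated computing infrastructures: cross-site transfers can be tuned to use network bandwidth efficiently, and the system fits web-based authentication flows built on OpenID Connect.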

## Key findings

- Efficient network bandwidth utilization is achievable with proper client configuration and file size.
- iRODS fits the requirements for integration in federated computing infrastructures that rely on web-based OpenID Connect authentication flows and rich online services.
- Optimization opportunities exist for cross-site data transfers in distributed workflows.

## Abstract

Modern data-management frameworks promise flexible and efficient management of data and metadata across storage backends. However, such claims need to be put to a meaningful test in daily practice. We conjecture that such frameworks should be fit to serve as a data backend for workflows that use geographically distributed high-performance and cloud computing systems. Cross-site data transfers within such a backend should largely saturate network bandwidth, in particular when parameters such as buffer sizes are optimized. To explore this further, we evaluate the “integrated Rule-Oriented Data System” iRODS with EUDAT’s B2SAFE module as the data backend for the “Distributed Data Infrastructure” within the LEXIS Platform for complex computing workflow orchestration and distributed data management. The focus of our study is on testing our conjectures, i.e., on construction and assessment of the data infrastructure and on measurements of data-transfer performance over the wide-area network between two selected supercomputing sites connected to LEXIS. We analyze limitations and identify optimization opportunities. Efficient utilization of the available network bandwidth is possible and depends on suitable client configuration and file size. Our work shows that systems such as iRODS now fit the requirements for integration in federated computing infrastructures involving web-based authentication flows with OpenID Connect and rich online services. We are continuing to exploit these properties in the EXA4MIND project, where we aim to optimize data-heavy workflows, integrating various systems for managing structured and unstructured data.
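
The abstract ties bandwidth saturation to buffer sizes and file size. The following is a minimal back-of-the-envelope sketch of why both matter on a wide-area link. It is not the authors' method or measurement setup. It uses the standard bandwidth-delay product and a simple fixed-overhead transfer model; the function names and every number in the usage lines (link speed, round-trip time, per-file overhead) are hypothetical illustration values, not figures from the paper.

```python
def bandwidth_delay_product_bytes(bandwidth_gbit_s: float, rtt_ms: float) -> float:
    """Bytes that must be in flight to keep a link of this capacity full.

    A single stream whose send/receive buffers are smaller than this value
    cannot saturate the link, whatever the storage backend.
    """
    bits_in_flight = bandwidth_gbit_s * 1e9 * (rtt_ms / 1e3)
    return bits_in_flight / 8


def effective_throughput_gbit_s(
    file_size_bytes: float,
    bandwidth_gbit_s: float,
    per_file_overhead_s: float,
) -> float:
    """Throughput of one file transfer with a fixed setup cost per file.

    Assumed model: total time = per-file overhead + time on the wire.
    The overhead stands in for connection setup, authentication and
    catalogue updates; its value is a hypothetical placeholder.
    """
    file_size_gbit = file_size_bytes * 8 / 1e9
    wire_time_s = file_size_gbit / bandwidth_gbit_s
    return file_size_gbit / (per_file_overhead_s + wire_time_s)


# Hypothetical link: 10 Gbit/s with a 10 ms round-trip time.
bdp = bandwidth_delay_product_bytes(10.0, 10.0)

# Hypothetical 0.2 s overhead per file, for a 1 MB and a 1 GB file.
small = effective_throughput_gbit_s(1e6, 10.0, 0.2)
large = effective_throughput_gbit_s(1e9, 10.0, 0.2)
```

Under these assumed numbers the small file reaches only a small fraction of the link rate while the large file gets close to it, which is consistent with the abstract's statement that efficient utilization depends on client configuration and file size.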

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12795369/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12795369/full.md

## References

54 references — full list in the complete paper: https://tomesphere.com/paper/PMC12795369/full.md

---
Source: https://tomesphere.com/paper/PMC12795369