# Simulating Data Access Profiles of Computational Jobs in Data Grids

**Authors:** Volodimir Begy, Joeri Hermans, Martin Barisits, Mario Lassnig, Erich Schikuta

arXiv: 1902.10069 · 2019-03-14

## TL;DR

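This paper introduces a grid computing simulator that emphasizes data access profiles (data placement, stage-in, and remote data access). Its parameters are calibrated with likelihood-free MCMC, and its output is validated against a production workload from the Worldwide LHC Computing Grid (WLCG) at CERN.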

## Contribution

A novel simulator for grid computing that models various data access profiles and is calibrated and validated with real-world CERN data.

## Key findings

- The simulator's output closely matches an authentic production workload from the WLCG.
- Posterior inference with likelihood-free MCMC calibrates the simulator parameters to the behavior of the true system.
- The data access profiles have partially non-overlapping throughput bottlenecks, which can be exploited to reduce the time jobs spend waiting for input data.
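
To make the calibration idea above concrete, here is a minimal sketch of one simple likelihood-free scheme, ABC-MCMC, applied to a toy problem. It is an illustration under stated assumptions, not the paper's method or code: the toy simulator `simulate_mean_transfer_time`, the single bandwidth parameter, the uniform prior, the tolerance `epsilon`, and all numeric values are hypothetical, and the paper's own likelihood-free MCMC variant may differ.

```python
import random
import statistics


def simulate_mean_transfer_time(bandwidth_mb_s, rng, n_files=50, file_size_mb=1000.0):
    """Toy stand-in for a grid simulator (hypothetical): returns the mean
    per-file transfer time in seconds, with multiplicative log-normal noise."""
    times = [
        file_size_mb / bandwidth_mb_s * rng.lognormvariate(0.0, 0.1)
        for _ in range(n_files)
    ]
    return statistics.mean(times)


def abc_mcmc(observed, rng, n_steps=2000, epsilon=0.5,
             prior=(10.0, 500.0), step=10.0):
    """ABC-MCMC over a single bandwidth parameter with a uniform prior."""
    low, high = prior

    # Initialise with plain ABC rejection so the chain starts in a region
    # where simulated and observed summaries already agree.
    theta = rng.uniform(low, high)
    while abs(simulate_mean_transfer_time(theta, rng) - observed) > epsilon:
        theta = rng.uniform(low, high)

    chain = []
    for _ in range(n_steps):
        proposal = theta + rng.gauss(0.0, step)  # symmetric random walk
        # With a uniform prior and a symmetric proposal the Metropolis ratio
        # is 1 inside the prior support, so acceptance reduces to the
        # simulation-based distance check. No likelihood is ever evaluated.
        if low <= proposal <= high:
            simulated = simulate_mean_transfer_time(proposal, rng)
            if abs(simulated - observed) <= epsilon:
                theta = proposal
        chain.append(theta)
    return chain


rng = random.Random(0)
# Synthetic "observed" summary generated from a known bandwidth of 120 MB/s.
observed = simulate_mean_transfer_time(120.0, rng)
chain = abc_mcmc(observed, rng)
print(f"posterior mean bandwidth: {statistics.mean(chain):.1f} MB/s")
```

The chain only ever calls the simulator and compares summaries, which is what makes the approach likelihood-free. A smaller `epsilon` tightens the approximate posterior at the cost of a lower acceptance rate.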

## Abstract

The data access patterns of applications running in computing grids are changing due to the recent proliferation of high-speed local and wide area networks. Data-intensive jobs are no longer strictly required to run at the computing sites where the respective input data are located. Instead, jobs may access the data employing arbitrary combinations of data placement, stage-in and remote data access. These data access profiles exhibit partially non-overlapping throughput bottlenecks. This fact can be exploited in order to minimize the time jobs spend waiting for input data. In this work we present a novel grid computing simulator, which puts a heavy emphasis on the various data access profiles. The fundamental assumptions underlying our simulator are justified by empirical experiments performed in the Worldwide LHC Computing Grid (WLCG) at CERN. We demonstrate how to calibrate the simulator parameters in accordance with the true system using posterior inference with likelihood-free Markov Chain Monte Carlo. Thereafter, we validate the simulator's output with respect to an authentic production workload from WLCG, demonstrating its remarkable accuracy.
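
The point about non-overlapping bottlenecks can be illustrated with a small back-of-the-envelope model. The sketch below is not the paper's simulator: it assumes two profiles whose bottlenecks are fully independent (the abstract says only "partially"), and the function names, throughput figures, and input size are hypothetical.

```python
def waiting_time(split_gb, throughput_gb_s):
    """Seconds until all input has arrived when the profiles run concurrently
    and do not share a bottleneck (a simplifying assumption)."""
    return max(
        (size / throughput_gb_s[name] for name, size in split_gb.items() if size > 0),
        default=0.0,
    )


def proportional_split(total_gb, throughput_gb_s):
    """Split the input so that every profile finishes at the same time."""
    total_throughput = sum(throughput_gb_s.values())
    return {
        name: total_gb * rate / total_throughput
        for name, rate in throughput_gb_s.items()
    }


# Hypothetical per-profile throughputs in GB/s, not taken from the paper.
throughput = {"stage_in": 0.5, "remote_access": 0.25}
input_gb = 90.0

single = waiting_time({"stage_in": input_gb}, throughput)
combined = waiting_time(proportional_split(input_gb, throughput), throughput)
print(f"stage-in only: {single:.0f} s, combined: {combined:.0f} s")
```

With these made-up numbers, fetching everything via stage-in takes 180 s, while splitting the input across both profiles in proportion to their throughput takes 120 s. When bottlenecks overlap, the achievable gain is smaller, which is the kind of effect a full simulator is meant to capture.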

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1902.10069/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/1902.10069/full.md

## References

24 references — full list in the complete paper: https://tomesphere.com/paper/1902.10069/full.md

---
Source: https://tomesphere.com/paper/1902.10069