Grid Data Management in Action: Experience in Running and Supporting Data Management Services in the EU DataGrid Project
Heinz Stockinger, Flavia Donno, Erwin Laure, Shahzad Muzaffar, Peter Kunszt, Giuseppe Andronico, Paul Millar

TL;DR
This paper discusses the implementation, refinement, and operational experience of data management services in the EU DataGrid project, highlighting lessons learned and future architecture plans for improved data handling in grid computing.
Contribution
It presents the design, deployment, and user feedback of the first-generation Data Management Services in the EU DataGrid project, informing future improvements.
Findings
Services have been refined for robustness and pre-production readiness.
Successful integration with other EDG components and partner projects.
Lessons learned inform the architecture of next-generation data management services.
Abstract
In the first phase of the EU DataGrid (EDG) project, a Data Management System has been implemented and provided for deployment. The components of the current EDG Testbed are: a prototype of a Replica Manager Service built around the basic services provided by Globus, a centralised Replica Catalogue to store information about physical locations of files, and the Grid Data Mirroring Package (GDMP) that is widely used in various HEP collaborations in Europe and the US for data mirroring. During this year these services have been refined and made more robust so that they are fit to be used in a pre-production environment. Application users have been using this first release of the Data Management Services for more than a year. In the paper we present the components and their interaction, our implementation and experience as well as the feedback received from our user communities. We have…
Taxonomy
Topics: Distributed and Parallel Computing Systems · Advanced Data Storage Technologies · Peer-to-Peer Network Technologies
