Building A High Performance Parallel File System Using Grid Datafarm and ROOT I/O
Y. Morita (1), H. Sato (1), Y. Watase (1), O. Tatebe (2), S. Sekiguchi (2), S. Matsuoka (3), N. Soda (4), A. Dell'Acqua (5)

TL;DR
This paper presents a high-performance parallel file system leveraging Grid Datafarm and ROOT I/O, enabling efficient petabyte-scale data management and transfer for large-scale physics experiments.
Contribution
It introduces a modular architecture integrating ROOT I/O with Grid Datafarm, demonstrating scalable data processing and transfer for high-energy physics data analysis.
Findings
Achieved over 2.3 Gbps data transfer rate across clusters.
Generated 10^6 Geant4-simulated Atlas Mockup events on a 512-CPU PC cluster.
Successfully integrated ROOT I/O with Grid Datafarm for distributed data management.
Abstract
The sheer amount of petabyte-scale data foreseen in the LHC experiments requires careful consideration of the persistency design and the system design in worldwide distributed computing. The event parallelism of HENP data analysis lets us take maximum advantage of high-performance cluster computing and networking, provided we keep the parallelism in the data processing, data management, and data transfer phases. The modular architecture of FADS/Goofy, a versatile detector simulation framework for Geant4, makes it easy to choose plug-in facilities for persistency technologies such as Objectivity/DB and ROOT I/O. The framework is designed to work naturally with the parallel file system of Grid Datafarm (Gfarm). FADS/Goofy has been shown to generate 10^6 Geant4-simulated Atlas Mockup events using a 512 CPU PC cluster. The data in ROOT I/O files is…
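The event parallelism described in the abstract can be illustrated with a minimal sketch. It assumes events are independent and can be split into contiguous, near-equal fragments, one per worker. The names `partition_events`, `simulate_fragment`, and `run` are hypothetical and are not part of FADS/Goofy, Gfarm, or ROOT; a thread pool stands in for the cluster nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def partition_events(n_events, n_workers):
    """Split event indices 0..n_events-1 into n_workers contiguous, near-equal ranges."""
    base, extra = divmod(n_events, n_workers)
    ranges = []
    start = 0
    for worker in range(n_workers):
        # The first `extra` workers each take one additional event.
        size = base + (1 if worker < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges


def simulate_fragment(events):
    """Hypothetical stand-in for simulating one fragment; returns the event count."""
    return len(events)


def run(n_events, n_workers, pool_size=8):
    """Process every fragment independently and collect per-fragment event counts."""
    fragments = partition_events(n_events, n_workers)
    # Fragments share no state, so they can be processed in any order.
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        return list(pool.map(simulate_fragment, fragments))


if __name__ == "__main__":
    counts = run(10**6, 512)
    print(len(counts), min(counts), max(counts), sum(counts))
```

With the figures quoted above (10^6 events, 512 CPUs), an even split gives each worker 1953 or 1954 events. How the actual system assigns events to nodes is not stated in the text shown here.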
Taxonomy
Topics: Distributed and Parallel Computing Systems · Advanced Data Storage Technologies · Parallel Computing and Optimization Techniques
