Persistent Data Layout and Infrastructure for Efficient Selective Retrieval of Event Data in ATLAS
Peter van Gemmeren, David Malon

TL;DR
This paper presents a persistent data layout strategy for the ATLAS experiment at CERN that improves the efficiency of selective event retrieval from large datasets stored column-wise with ROOT.
Contribution
It introduces a novel persistent storage layout for event data that enhances selective retrieval performance in large-scale physics data analysis.
Findings
Improved I/O performance for selective event reading.
Effective use of ROOT's new capabilities for data layout tuning.
Enhanced data retrieval efficiency in high-volume datasets.
Abstract
The ATLAS detector at CERN has completed its first full year of recording collisions at 7 TeV, resulting in billions of events and petabytes of data. At these scales, physicists must have the capability to read only the data of interest to their analyses, with the importance of efficient selective access increasing as data taking continues. ATLAS has developed a sophisticated event-level metadata infrastructure and supporting I/O framework allowing event selections by explicit specification, by back navigation, and by selection queries to a TAG database via an integrated web interface. These systems and their performance have been reported on elsewhere. The ultimate success of such a system, however, depends significantly upon the efficiency of selective event retrieval. Supporting such retrieval can be challenging, as ATLAS stores its event data in column-wise orientation using ROOT…
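The difficulty the abstract describes, retrieving single events from column-wise storage, can be illustrated with a toy model. The sketch below does not use ROOT and is not the layout proposed in the paper. It assumes, as a simplification, that every column is written in fixed-size compressed chunks, so reading one event means decompressing one whole chunk per column. The names `write_columns`, `read_event`, and `entries_per_chunk` are hypothetical.

```python
# Toy model of column-wise storage (an illustration only, not ROOT and not
# the ATLAS layout): each attribute is stored as a list of compressed chunks.
import pickle
import zlib


def write_columns(events, entries_per_chunk):
    """Store event dicts column-wise: one list of compressed chunks per attribute."""
    columns = {}
    for name in events[0]:
        values = [event[name] for event in events]
        columns[name] = [
            zlib.compress(pickle.dumps(values[start:start + entries_per_chunk]))
            for start in range(0, len(values), entries_per_chunk)
        ]
    return columns


def read_event(columns, index, entries_per_chunk):
    """Read one event; also return how many entries were decompressed to get it."""
    event, entries_decoded = {}, 0
    chunk_no, offset = divmod(index, entries_per_chunk)
    for name, chunks in columns.items():
        # The whole chunk must be decompressed even though one value is needed.
        values = pickle.loads(zlib.decompress(chunks[chunk_no]))
        entries_decoded += len(values)
        event[name] = values[offset]
    return event, entries_decoded


# Hypothetical events with three attributes each.
events = [{"run": 1, "pt": 0.5 * i, "eta": 0.01 * i} for i in range(1000)]

for size in (1, 10, 100):
    columns = write_columns(events, size)
    event, decoded = read_event(columns, 437, size)
    print(f"entries_per_chunk={size}: decoded {decoded} entries for one event")
```

In this model the work needed to fetch one event grows with the chunk size times the number of columns, while larger chunks generally compress better and suit sequential reads. That trade-off is the kind of tension a layout tuned for selective retrieval has to balance.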
Taxonomy
Topics: Advanced Data Storage Technologies · Distributed and Parallel Computing Systems · Particle Detector Development and Performance
