Twelve Ways to Build CMS Crossings from ROOT Files
D. Chamont, C. Charlot

TL;DR
This paper evaluates different methods for efficiently reading and processing CMS raw data from ROOT files, focusing on performance optimization for pseudo-random event access in large datasets.
Contribution
It compares ROOT containers with STL vectors, and trees with direct storage of containers, to identify the most performant strategy for reading the foreign objects and templates used in the simulation of CMS raw data.
Findings
Using clones within trees gives by far the best performance, but it remains hard to tune and depends heavily on the exact use case.
STL vectors could reach similar performance more easily in a future ROOT release.
Performance varies significantly depending on the data access pattern.
Abstract
The simulation of CMS raw data requires the random selection of one hundred and fifty pileup events from a very large set of files, to be superimposed in memory on the signal event. This use of ROOT I/O is quite unusual: the events are read pseudo-randomly rather than sequentially, they are processed in memory in bunches rather than one by one, and they contain many foreign objects and templates rather than orthodox ROOT objects. In this context, we have compared the performance of ROOT containers with that of STL vectors, and the use of trees with direct storage of containers. The strategy with the best performance is by far the one using clones within trees, but it remains hard to tune and very dependent on the exact use case. The use of STL vectors could more easily bring similar performance in a future ROOT release.
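The access pattern described in the abstract can be illustrated with a minimal sketch. It is not the authors' code and does not use ROOT: it only draws one hundred and fifty pseudo-random (file, entry) pairs from a set of files and groups them per file, so that each file would be opened once per crossing. The function names `draw_pileup` and `group_by_file`, the entry counts, drawing without replacement, and the per-file grouping are all assumptions made for illustration.

```python
import bisect
import random
from collections import defaultdict


def draw_pileup(entries_per_file, n_pileup=150, seed=None):
    """Draw n_pileup distinct (file_index, entry_index) pairs.

    Assumption: events are drawn uniformly and without replacement
    over all entries of all files. The source does not specify this.
    """
    total = sum(entries_per_file)
    if n_pileup > total:
        raise ValueError("not enough events to draw from")

    rng = random.Random(seed)
    picks = rng.sample(range(total), n_pileup)  # global entry indices

    # Offset of the first entry of each file in the global numbering.
    offsets = []
    running = 0
    for n in entries_per_file:
        offsets.append(running)
        running += n

    pairs = []
    for g in picks:
        f = bisect.bisect_right(offsets, g) - 1  # file holding entry g
        pairs.append((f, g - offsets[f]))
    return pairs


def group_by_file(pairs):
    """Group the drawn pairs per file, with entries in ascending order.

    Assumption: reading the entries of one file in ascending order
    is cheaper than jumping between files for every pileup event.
    """
    plan = defaultdict(list)
    for f, e in pairs:
        plan[f].append(e)
    return {f: sorted(entries) for f, entries in sorted(plan.items())}


if __name__ == "__main__":
    # Hypothetical entry counts for three pileup files.
    counts = [1000, 2000, 500]
    plan = group_by_file(draw_pileup(counts, n_pileup=150, seed=1))
    for f, entries in plan.items():
        print(f"file {f}: {len(entries)} entries to read")
```

In a real setup, each list in the resulting plan would drive the reads on the corresponding file; how those reads perform is exactly what the compared storage strategies change.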
Taxonomy
Topics: Distributed and Parallel Computing Systems · Advanced Data Storage Technologies · Scientific Computing and Data Management
